RTL Modeling with SystemVerilog
for Simulation and Synthesis
using SystemVerilog for ASIC and FPGA design
6/15/17
Other books authored or co-authored by Stuart Sutherland:
Verilog and SystemVerilog Gotchas: 101 Common Coding Errors and How to Avoid
Them
Common coding mistakes and guidelines on how to write correct code. Co-authored
with Don Mills.
SystemVerilog for Design: A Guide to Using SystemVerilog for Hardware Design
and Modeling, Second Edition
Describes what SystemVerilog-2005 added to the Verilog-2001 language for RTL
modeling. Assumes the reader is familiar with Verilog-2001. Written by Stuart
Sutherland, with advice and contributions from Simon Davidmann and Peter
Flake. Includes an appendix with a detailed history of Hardware Description
Languages by Peter Flake.
Verilog-2001: A Guide to the New Features in the Verilog Hardware Description
Language
Describes what Verilog-2001 added to the original Verilog-1995 language.
Assumes the reader is familiar with Verilog-1995.
The Verilog PLI Handbook: A Tutorial and Reference Manual on the Verilog
Programming Language Interface, Second Edition
A comprehensive reference and tutorial on Verilog-2001 PLI and VPI programming
interfaces into Verilog simulation.
Verilog HDL Quick Reference Guide, based on the Verilog-2001 Standard
A concise reference on the syntax of the complete Verilog-2001 language.
Verilog PLI Quick Reference Guide, based on the Verilog-2001 Standard
A concise reference on the Verilog-2001 Programming Language Interface, with
complete object relationship diagrams.
Stuart Sutherland
published by:
Sutherland HDL, Inc.
Tualatin, Oregon, USA
sutherland-hdl.com
printed by:
CreateSpace, An Amazon.com Company
eStore: www.CreateSpace.com/7164313
ISBN-13: 978-1-5467-7634-5
ISBN-10: 1-5467-7634-6
Copyright © 2017, Sutherland HDL, Inc.
All rights reserved. This work may not be translated, copied, or reproduced in whole
or in part without the express written permission of the copyright owner, except for
brief excerpts in connection with reviews or scholarly analysis. Use in connection
with any form of information storage and retrieval, electronic adaptation, computer
software, or by similar or dissimilar methodology now known or hereafter developed
is forbidden.
The use in this work of trade names, trademarks, service marks, and similar terms,
even if they are not identified as such, is not to be taken as an expression of opinion as
to whether or not they are subject to proprietary rights.
Dedication
To my wonderful wife, LeeAnn, and my children, Ammon, Tamara, Hannah, Seth and
Samuel, and each of their families. Families are forever!
Stuart Sutherland
Portland, Oregon, USA
Table of Contents
Table of Contents (page ix)
List of Figures (page xxi)
Foreword (page xxv)
Preface (page xxvii)
  Why this book (page xxvii)
  Intended audience for this book (page xxviii)
  Topics covered in this book (page xxviii)
  Book examples (page xxix)
  Obtaining copies of the examples (page xxx)
  Simulators and synthesis compilers used in this book (page xxx)
  Other sources of information (page xxxi)
  Acknowledgements (page xxxi)
Index (page 441)
List of Examples
This book contains a number of examples that illustrate the proper usage of
SystemVerilog constructs. A summary of the major code examples is listed in this
section. In addition to these examples, each chapter contains many code fragments,
referred to as snippets, that illustrate specific features of SystemVerilog. The
source code for the full examples can be downloaded from
http://www.sutherland-hdl.com. Navigate the menus to “SystemVerilog Book Examples”.
The Preface provides more details regarding the code examples in this book.
Example 4-7: Arithmetic Logical Unit (ALU) with structure and union ports (page 135)
Example 4-8: Using arrays of structures to model an instruction register (page 137)
List of Figures
Figure 5-16: Synthesis result for Example 5-14: Streaming operator (bit reversal) (page 183)
Figure 5-17: Synthesis result for Example 5-15: Arithmetic operation, unsigned (page 187)
Figure 5-18: Synthesis result for Example 5-16: Arithmetic operation, signed (page 187)
Figure 5-19: Synthesis result for Example 5-18: Increment and decrement operators (page 193)
Figure 5-20: Synthesis result after mapping to a Xilinx Virtex®-7 FPGA (page 193)
Figure 5-21: Synthesis result after mapping to a Xilinx CoolRunner™-II CPLD (page 194)
Figure 5-22: Synthesis result for Example 5-19: Assignment operators (page 197)
Figure 5-23: Synthesis result for Example 5-20: Size casting (page 205)
Figure 5-24: Synthesis result for Example 5-21: Sign casting (page 208)
Foreword
by Phil Moorby
The creator of the Verilog language
Verilog is now over 30 years old, and has spanned the years of designing with
graphical schematic entry tools of a few thousand gates, to modern RTL design using
tools supporting millions, if not billions, of gates, all following the enduring
prediction of Moore's law. Verilog addressed the simulation and verification problems of the
day, but also included capabilities that enabled a new generation of EDA technology
to evolve, namely synthesis from RTL. Verilog thus became the mainstay language of
IC designers.
Behind the scenes, there has been a steady process of inventing and learning what
was needed and what worked (and what did not work!) to improve the language to
keep up with the inevitable growth demands. From the public's point of view, there
were the stepping-stones from one published standard to the next: the first published
standard in 1995, the eagerly awaited update of Verilog in 2001, the final version of
the older Verilog standard in 2005, and the matured SystemVerilog standard in 2012, just to
name some of the main stones.
I have always held the belief that for hardware designers to achieve their best in
inventing new ideas they must think (if not dream) in a self contained, consistent and
concise language. It is often said when learning a new natural language that your
brain doesn't get it until you realize that you are speaking it in your dreams.
Over the last 15 years, Verilog has been extended and matured into the
SystemVerilog language of today, and includes major new abstract constructs,
test-bench verification, formal analysis, and C-based APIs. SystemVerilog also
defines new layers in the Verilog simulation strata. These extensions provide
significant new capabilities to the designer, verification engineer and architect,
allowing better teamwork and coordination between different project members. As was
the case with the original Verilog, teams who adopt SystemVerilog-based tools will
be more productive and produce better quality designs in shorter periods. Many
published textbooks on the design side of the new SystemVerilog assumed that the
reader was familiar with Verilog, and simply explained the new extensions. It is
time to leave behind the stepping-stones and to teach a single consistent and
concise language in a single book, and maybe not even refer to the old ways at all!
Phil Moorby,
Montana Systems, Inc.
Massachusetts, 2016
Preface
Why this book
1. Chris Spear and Greg Tumbush, “SystemVerilog for Verification, Third Edition”, New York, NY:
Springer 2012, 978-1-4614-0715-7.
NOTE
This book assumes the reader is already familiar with digital logic design.
The text and examples in this book assume and require an understanding of digital
logic. Concepts such as AND, OR and Exclusive-OR gates, multiplexors, flip-flops,
and state machines are not defined in this book. This book can be a useful resource in
conjunction with learning and applying digital design engineering skills.
Chapter 8 examines the correct way to model RTL sequential logic behavior. Topics
include synchronous and asynchronous resets, set/reset flip-flops, chip-enable flip-
flops, and memory devices, such as RAMs.
Chapter 9 presents the proper way to model latches in RTL models, and how to avoid
unintentional latches.
Chapter 10 discusses the powerful interface construct that SystemVerilog adds to
traditional Verilog. Interfaces greatly simplify the representation of complex buses
and enable the creation of more intelligent, easier to use IP (intellectual property)
models.
Appendix A summarizes the best-practice coding guidelines and recommendations
that are made in each chapter of the book.
Appendix B lists the set of reserved keywords for each generation of the Verilog and
SystemVerilog standards.
Appendix C is a reprint of a paper entitled I'm Still In Love With My X, regarding
how X values propagate in RTL models. The paper recommends ways to minimize or
catch potential problems with X-optimism and X-pessimism in RTL models.
Appendix D lists some additional resources that are closely related to the topics
discussed in this book.
Book examples
The examples in this book illustrate specific SystemVerilog constructs in a realistic,
though small, context. Complete code examples list the code between two horizontal
lines, as shown below. This book uses a convention of showing all SystemVerilog
keywords in bold.
SystemVerilog RTL model of 32-bit adder/subtractor (same as Example 1-3, page 11)
module rtl_adder_subtractor
(input  logic        clk,   // 1-bit scalar input
 input  logic        mode,  // 1-bit scalar input
 input  logic [31:0] a, b,  // 32-bit vector inputs
 output logic [31:0] sum    // 32-bit vector output
);
  always_ff @(posedge clk) begin
    if (mode == 0) sum <= a + b;
    else           sum <= a - b;
  end
endmodule: rtl_adder_subtractor
Each chapter also contains many shorter examples, referred to as code snippets.
These snippets are not complete models, and are not encapsulated between horizontal
lines. The full source code, such as variable declarations, is not included in these
code snippets. This was done in order to focus on specific aspects of SystemVerilog
constructs without clutter from surrounding code.
NOTE
This book strives to be vendor and software tool neutral. While specific
products were used to test the examples in this book, all examples should run
with any simulator or synthesis compiler that adheres to the IEEE 1800-2012
SystemVerilog standard.
The examples in this book have been tested with multiple simulation and synthesis
tools, including (listed alphabetically by company name):
Acknowledgements
I am grateful to all those who have helped with this book. I would like to
specifically thank those who provided invaluable feedback by reviewing specific
chapters of the book for technical content and accuracy. These reviewers include: Leah Clark,
Clifford Cummings, Steve Golson, Kelly Larson, Don Mills and Chris Spear. I am
also grateful to Shalom Bresticker, who answered many technical questions over the
period of time that I wrote this book.
Special recognition is extended to Don Mills, who provided valuable feedback and
assistance throughout the writing process. Don recommended ideas for many of the
book examples, and helped with testing the code examples on multiple simulators and
synthesis compilers.
I am especially appreciative of Phil Moorby, the creator of the original Verilog
language and simulator, for writing the foreword for this book and for creating a
long-lasting design and verification language for the digital design industry.
I would also like to recognize and thank my wonderful wife, LeeAnn Sutherland,
for her painstaking reviews of this book for grammar, punctuation and readability.
* * *
Chapter 1
SystemVerilog Simulation and Synthesis
Abstract — This chapter explores the general concepts of modeling hardware using
SystemVerilog, and the roles of simulation and synthesis in the hardware design flow.
Some of the major topics presented in this section are:
• The difference between Verilog and SystemVerilog
• RTL and gate-level modeling
• Defining an RTL synthesis subset of SystemVerilog
• Modeling ASICs and FPGAs
• Model verification testbenches
• The role and usage of digital simulation with SystemVerilog
• The role and usage of digital synthesis with SystemVerilog
• The role and usage of SystemVerilog lint checkers
Verilog and SystemVerilog are synonymous names for the same Hardware Description
Language (HDL). SystemVerilog is the newer name for the official IEEE language
standard, and replaces the original Verilog name.
Verilog began as a proprietary design language in the early 1980s, for use with a
digital simulator sold by Gateway Design Automation. The proprietary Verilog HDL
was opened to the public domain in 1989, and standardized by the IEEE as an
international standard in 1995 as IEEE Std 1364-1995™ (commonly referred to as
“Verilog-95”). The IEEE updated the Verilog standard in 2001 as the 1364-2001™
standard, referred to as “Verilog-2001”. The last official version under the Verilog
name was IEEE Std 1364-2005™. In that same year, the IEEE released an extensive
set of enhancements to the Verilog HDL. These enhancements were initially
documented under a different standards number and name, the IEEE Std 1800-2005™
SystemVerilog standard. In 2009, the IEEE terminated the IEEE-1364 standard, and
merged Verilog-2005 into the SystemVerilog standard, under the standards number
IEEE Std 1800-2009™. Additional design and verification enhancements were added in
2012, as the IEEE Std 1800-2012™ standard, referred to as SystemVerilog-2012. At
the time this book was written, the IEEE was nearing completion of a proposed IEEE
Std 1800-2017™ standard, or SystemVerilog-2017. This version only corrects errata
in the 2012 version of the standard, and adds clarifications on the language syntax
and semantic rules.
It is important to note that the Accellera SystemVerilog 3.1 document was not a
complete, stand-alone language. It was a set of extensions to the IEEE 1364-2001
Verilog language. Accellera’s initial intent was that the IEEE would then add these
extensions to the next version of the IEEE 1364 Verilog standard, targeted to be
1364-2005, nicknamed Verilog-2005. For multiple reasons, however, the IEEE Verilog
standards committee decided not to immediately merge these extensions into the
actual Verilog 1364 standard. Instead, the IEEE assigned a new standards number to
these extensions. In 2005, the IEEE released the 1364-2005 Verilog standard and, at
the same time, the 1800-2005 SystemVerilog extensions to Verilog standard.
Figure 1-2 shows the major features that SystemVerilog added to Verilog-2001. The
figure also shows that 4 features were incorporated into the Verilog 1364-2005
document, instead of the SystemVerilog 1800-2005 standard. Figure 1-2 does not
delineate between the 2005, 2009, 2012 and 2017 versions of SystemVerilog. Most of
the new capabilities that SystemVerilog added to traditional Verilog were made in
the SystemVerilog-2005 version. Only a small number of additional features were
added in the 2009 and 2012 versions, and nothing new was added in the 2017 version.
1800-2009 SystemVerilog standard. At that time, the IEEE terminated the old
Verilog-1364 standard. The name “Verilog” officially became “SystemVerilog”.
The complexity of hardware designs and verifying those designs continues to
evolve, and the IEEE continues to evolve the SystemVerilog standard to keep pace. In
2012, the IEEE released an 1800-2012 SystemVerilog standard. At the time this book
was written, the IEEE was working on an 1800-2017 version of SystemVerilog. The
SystemVerilog-2017 version primarily makes clarifications to the SystemVerilog
standard, and does not add any new language features to the 2012 standard.
This book is based on the 2012/2017 versions of SystemVerilog.
The IEEE’s decision in 2005 to release two separate standards — one containing
the traditional Verilog language (1364-2005), and one containing only extensions to
Verilog and called SystemVerilog (1800-2005) — has been a source of confusion
among engineers. One common misconception that seems to persist is that Verilog is
a hardware modeling language and SystemVerilog is a verification language. This
misconception is not true! The original Verilog language was always an integrated
modeling and verification language. SystemVerilog extended, in substantial ways,
both the modeling aspects and the verification aspects of the original Verilog HDL.
SystemVerilog is both a digital modeling language and a digital verification language.
Simon Davidmann, one of the early pioneers in digital simulation, has written a
more detailed history on the origins of Verilog and SystemVerilog, which can be
found in the appendix of the book “SystemVerilog for Design, Second Edition”.
This book's focus. The focus of this book is on the design aspects of SystemVerilog.
The author recommends “SystemVerilog for Verification, Third Edition”, as a
companion to this book to learn about the verification aspects of the language.
This section defines the terminology commonly used to describe the levels of detail
in which hardware functionality can be modeled using SystemVerilog.
1.2.1 Abstraction
SystemVerilog is capable of modeling digital logic at many different levels of
detail, referred to as “abstraction levels”. Abstract means lack of detail. The more
abstract a digital model is, the less detail that model contains about the hardware that
it represents.
1. Stuart Sutherland, Simon Davidmann and Peter Flake, “SystemVerilog for Design, Second Edition”,
New York, NY: Springer 2016, 978-0-3873-3399-1.
2. Chris Spear and Greg Tumbush, “SystemVerilog for Verification, Third Edition”, New York, NY:
Springer 2012, 978-1-4614-0715-7.
Figure 1-3 shows the main levels of modeling abstraction available in SystemVerilog.
Primitive   Description
bufif1      Tri-state buffer gate with 1 input, 1 output, and 1 active-high enable
notif0      Tri-state inverter gate with 1 input, 1 output, and 1 active-low enable
notif1      Tri-state inverter gate with 1 input, 1 output, and 1 active-high enable
SystemVerilog also provides a means for ASIC and FPGA library developers to add
to the built-in set of primitives by defining User-Defined Primitives (UDPs). UDPs
are defined in a table format, where each line in the table lists a set of input values and
the resulting output values. Both combinational logic and sequential logic (such as
flip-flops) primitives can be defined.
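As an illustration of the table format, the following is a minimal sketch of one combinational and one sequential UDP. The primitive names mux2_udp and dff_udp, and their port names, are hypothetical and are not taken from any vendor library or from this book's examples. Input combinations that are not listed in a table produce an X on the output.

```systemverilog
// Hypothetical combinational UDP: 2-to-1 multiplexer
primitive mux2_udp (output y, input sel, input d0, input d1);
  table
  // sel d0 d1 : y
      0   0  ?  : 0;   // sel=0 passes d0 (? means 0, 1 or x)
      0   1  ?  : 1;
      1   ?  0  : 0;   // sel=1 passes d1
      1   ?  1  : 1;
      x   0  0  : 0;   // unknown select, but both data inputs agree
      x   1  1  : 1;
  endtable
endprimitive

// Hypothetical sequential UDP: D-type flip-flop without reset
primitive dff_udp (output reg q, input clk, input d);
  table
  // clk d : q : q+   (current state : next state)
      r  0 : ? : 0;   // rising clock edge loads 0
      r  1 : ? : 1;   // rising clock edge loads 1
      n  ? : ? : -;   // falling (or possibly falling) edge keeps the state
      ?  * : ? : -;   // a change on d alone keeps the state
  endtable
endprimitive
```

A UDP is instantiated in the same way as a built-in primitive, with the output listed first, for example: mux2_udp u1 (y, sel, d0, d1);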
Figure 1-4 shows a gate-level circuit of a 1-bit adder with carry. Example 1-1
shows the SystemVerilog code that models this circuit using primitives.
Figure 1-4: 1-bit adder with carry, represented with logic gates
endmodule: gate_adder
Many of the gate-level primitives can have a variable number of inputs. For example,
an and primitive can represent a 2-input, 3-input, or 4-input AND gate, as follows:
and i1 (o1, a, b);         // 2-input AND gate
and i2 (o2, a, b, c);      // 3-input AND gate
and i3 (o3, a, b, c, d);   // 4-input AND gate
The instance name for primitives is optional, but is good code documentation. It
makes it easier to maintain code and to relate SystemVerilog source code to
schematics or other representations of the design. The instance name is user-defined, and can
be any legal SystemVerilog name.
Gate-level primitives can be modeled with propagation delays. If no delay is
specified, then a change on an input to the gate will be immediately reflected on
the output of the gate. The delay is an expression that can be a simple value, as in
instance g2 in Example 1-1, or a more complex expression, as in instance g5. Gate g2
in the example above has a propagation delay of 1.3 nanoseconds, meaning that when
there is a transition on one of the gate inputs, it will be 1.3 nanoseconds before
the output of the gate, sum, will change. Gate g5 breaks the propagation delay into
different delays for rising and falling transitions on the output. If the value of
co is transitioning from a 0 to 1, the change will be delayed by 1.5 nanoseconds. If
co is transitioning from a 1 to 0, the change will be delayed by 1.8 nanoseconds.
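The delay notation described above can be sketched as follows. This is an illustrative sketch built only from the description in the text, and should not be read as the book's own listing. The module name gate_adder_sketch, the internal net names n1 to n3, the gates g1, g3 and g4, and the time units are all assumptions. Only the delay values on g2 and g5 come from the text.

```systemverilog
// Sketch of a 1-bit adder with carry, built from gate primitives.
module gate_adder_sketch
(input  wire a, b, ci,
 output wire sum, co
);
  timeunit 1ns; timeprecision 100ps;   // assumed: delays are in nanoseconds

  wire n1, n2, n3;                     // assumed internal net names

  xor             g1 (n1, a, b);       // no delay specified: zero delay
  xor #1.3        g2 (sum, n1, ci);    // single delay of 1.3 ns
  and             g3 (n2, a, b);
  and             g4 (n3, n1, ci);
  or  #(1.5, 1.8) g5 (co, n2, n3);     // 1.5 ns rising, 1.8 ns falling
endmodule: gate_adder_sketch
```

When two delay values are given in parentheses, the first applies to transitions of the output toward 1 and the second to transitions toward 0.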
Gate-level modeling can represent the propagation delays of actual silicon with a
high degree of accuracy. The functionality of the logic gates reflects the functionality
of the transistor combinations that would be used in silicon, and the gate delays can
reflect the propagation delays through those transistors. This accuracy is used by
ASIC and FPGA suppliers to model the detailed behavior of specific devices. Section
1.6 (page 31) in this chapter explores this usage of gate-level models in ASIC and
FPGA design flows.
Gate-level models are usually generated by software tools or engineers specializing
in library development. Design engineers designing at the RTL level seldom, if ever,
model with gate-level primitives. Rather, RTL designers use netlists of gate-level
models, where the netlists were generated by synthesizing the RTL models. The gate-
level models are provided by the vendor of the target ASIC or FPGA device. There is
much more to gate-level modeling than has been shown in this section, but this book
does not go into any further detail on this topic.
Switch-level modeling. SystemVerilog can also model digital circuits at the
transistor level, using switch primitives (such as pmos, nmos and cmos), resistive
switch primitives (such as rpmos, rnmos and rcmos), and capacitive nets. This level
of modeling can closely represent actual silicon implementation. However, since
these constructs only model digital behavior, they are seldom used. Transistors,
resistors and capacitors are analog devices. Digital simulation does not accurately
reflect transistor behavior. Switch-level modeling is typically not used in ASIC and
FPGA design flows with SystemVerilog, and is not discussed in detail in this book.
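For readers who have not seen the switch primitives, a minimal sketch of a CMOS inverter is shown here. The module, port and instance names are hypothetical. As noted above, this models only digital on/off behavior, not the analog characteristics of real transistors.

```systemverilog
// Sketch: CMOS inverter using switch-level primitives.
module cmos_inverter_sketch
(input  wire a,
 output wire y
);
  supply1 vdd;            // net tied to logic 1 with supply strength
  supply0 gnd;            // net tied to logic 0 with supply strength

  // terminal order for pmos and nmos: (output, data input, control)
  pmos p1 (y, vdd, a);    // conducts when a is 0, pulling y to 1
  nmos n1 (y, gnd, a);    // conducts when a is 1, pulling y to 0
endmodule: cmos_inverter_sketch
```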
endmodule: rtl_adder
One advantage of RTL modeling is that the code is more self-documenting. It can
be difficult to look at the gate-level model in Example 1-1 and recognize what the
model represents, especially if there were no comments and meaningful names. On
the other hand, it is much easier to look at the code in the RTL model in Example 1-2
and recognize that the functionality is an adder.
Another powerful advantage of RTL modeling is the ability to work with vectors
and bundles of data. A vector is a signal that is more than one bit wide. The detailed
switch and gate levels of modeling operate on 1-bit wide signals, which are
referred to as scalar signals in SystemVerilog. To model a 32-bit adder would require
modeling switches or gates that operated on each individual bit, the same as in actual
silicon. The continuous assignment statement in Example 1-2 above can model an
adder of any size, simply by changing the declarations of the signals.
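The point about vector sizes can be sketched with a parameterized adder. The module name rtl_adder_sketch and the parameter name N are assumptions made for this illustration. The continuous assignment is the same for any width, because only the declarations depend on N.

```systemverilog
// Sketch: RTL adder whose width is set by declarations only.
module rtl_adder_sketch
#(parameter int N = 32)          // hypothetical width parameter
(input  logic [N-1:0] a, b,      // N-bit vector inputs
 input  logic         ci,        // 1-bit scalar carry in
 output logic [N-1:0] sum,       // N-bit vector output
 output logic         co         // 1-bit scalar carry out
);
  // The left-hand side is N+1 bits wide, so the addition is
  // evaluated at N+1 bits and the carry out is not lost.
  assign {co, sum} = a + b + ci;
endmodule: rtl_adder_sketch
```

An 8-bit version is created by instantiating the module as rtl_adder_sketch #(.N(8)), without touching the assignment statement.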
More complex functionality can be modeled using procedural blocks. A procedural
block encapsulates one or more lines of programming statements, along with
information about when the statements should be executed. There are four types of
always procedures that are used at the RTL level: always, always_comb, always_ff and
always_latch. Chapter 6, section 6.1 (page 211) explores the use of always
procedural blocks in greater detail.
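As a brief preview, the four procedure types might be used as in the sketch below. The module and signal names are hypothetical, and the logic is deliberately trivial so that only the procedure keywords differ.

```systemverilog
// Sketch: the four always procedure types side by side.
module always_styles_sketch
(input  logic       clk, en,
 input  logic [7:0] d,
 output logic [7:0] q_gen, q_comb, q_ff, q_latch
);
  // general-purpose always: intent is inferred from the sensitivity list
  always @(posedge clk) q_gen <= d;

  // always_comb: combinational logic, sensitivity is inferred
  always_comb q_comb = d + 8'd1;

  // always_ff: clocked sequential logic
  always_ff @(posedge clk) q_ff <= d;

  // always_latch: level-sensitive storage, holds its value when en is 0
  always_latch if (en) q_latch <= d;
endmodule: always_styles_sketch
```

The three specialized forms state the designer's intent, which lets simulators, synthesis compilers and lint checkers report code that does not match that intent.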
The following example concisely represents a 32-bit adder/subtractor with
registered outputs:
endmodule: rtl_adder_subtractor
In a typical simulation and synthesis design flow, engineers will spend most of their
time modeling at the RTL level and verifying RTL functionality. The focus of this
book is on writing RTL models that will simulate and synthesize optimally.
A full definition of ASIC and FPGA technologies is beyond the scope of this book,
which is about proper digital logic modeling styles with the SystemVerilog language.
The purpose of this section is to look at how SystemVerilog modeling styles can be
affected by ASIC and FPGA technology. Details on ASIC and FPGA implementation
and the appropriate applications for these technologies are left to other engineering
books and discussions. In order to meet this objective on RTL modeling best
practices, however, it is important to understand the basic concepts of ASICs and FPGAs.
of the ICs. Other ASIC vendors provide the technology for the ASIC, but leave the
fabrication and production to some other source.
Most ASIC technologies use standard cells, which are pre-designed blocks of logic
consisting of one to several logic gates. An ASIC cell library might have a few
hundred standard cells, such as AND, NAND, OR, NOR, Exclusive-OR, Exclusive-NOR,
2-to-1 MUX, D-type flip-flop, latch, etc. Each cell will have well defined electrical
characteristics, such as propagation delays, setup and hold times, and capacitance.
Designing an ASIC involves selecting appropriate cells from the library and
connecting them together to perform the desired functionality. Software tools are used
throughout this process. The typical flow for ASIC design is shown in Figure 1-5.
steps of modeling, simulation, and synthesis. RTL coding styles can impact the
effectiveness of the tools used later in the design flow.
There are other types of ASIC technologies that do not use standard cells, such as
full-custom, gate-array, and structured ASICs. SystemVerilog can be used in a similar
way to design these other types of ASICs, though the software tools involved might
be different. The synthesis compilers used, and the SystemVerilog language
constructs supported by those compilers, can be very different with these other
technologies. This book focuses on modeling with SystemVerilog for the more general
standard cell ASIC technology.
1.4.2 FPGAs
FPGA is an acronym for Field Programmable Gate Array. An FPGA is an integrated
circuit containing a fixed number of logic blocks that can be configured after the
IC is manufactured (whereas the contents and layout of an ASIC must be determined
prior to manufacturing). Historically, FPGAs could not contain as much functionality
as an ASIC and ran at slower clock speeds, which were important considerations when
designing at the RTL level. Recent advancements in FPGA technology have
significantly narrowed the difference between FPGAs and ASICs. In general, an FPGA
can be used to implement the same functionality as an ASIC.
FPGAs contain an array of many small logic components referred to as Configurable
Logic Blocks (CLBs). Some FPGA vendors refer to these blocks as Logic Array Blocks
(LABs). A typical CLB might contain one or more Look-up Tables (LUTs), some
multiplexers (MUXes), and a storage element such as a D-type flip-flop. The look-up
tables in most FPGAs are small RAMs that are programmed with logic operations such
as AND, OR and XOR. Selecting a desired operation from the LUT allows a CLB to be
used in a variety of ways, from a simple AND or XOR gate to much more complex
combinational functionality. The CLBs in some FPGAs might also have other
functionality, such as an adder. A MUX allows the combinational result to be
directly output from the CLB (asynchronous output), or to be registered in the
storage element (synchronous output).
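The CLB structure described above can be approximated with a short behavioral sketch. This is a simplified illustration and not a model of any vendor's device. The module name clb_sketch and the port names lut_config and use_register are assumptions, and a real CLB receives its configuration from the device programming bitstream rather than from input ports.

```systemverilog
// Sketch: one 4-input LUT, one flip-flop, and an output MUX.
module clb_sketch
(input  logic        clk,
 input  logic [15:0] lut_config,    // programmed truth table (hypothetical)
 input  logic        use_register,  // 1 = registered, 0 = combinational
 input  logic [3:0]  addr,          // the four logic inputs of the LUT
 output logic        out
);
  logic lut_out, ff_q;

  // The LUT is a 16 x 1 memory: the four inputs select one stored bit
  assign lut_out = lut_config[addr];

  // Storage element: D-type flip-flop
  always_ff @(posedge clk) ff_q <= lut_out;

  // Output MUX selects the combinational or the registered result
  assign out = use_register ? ff_q : lut_out;
endmodule: clb_sketch
```

With lut_config set to 16'h8000 the block behaves as a 4-input AND gate, because only the table entry for addr equal to 4'b1111 holds a 1. With 16'h6996 it behaves as a 4-input XOR.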
An FPGA will have been manufactured with an array containing many hundreds or
thousands of CLBs, along with configurable interconnections that can be “programmed”
to a desired configuration of CLBs. An FPGA also contains I/O pads, which can be
configured to connect to either one column or one row of the CLB array.
A typical design flow for a complex FPGA is shown in Figure 1-6.
[Figure 1-6: FPGA design flow, with the steps grouped into front end and back end]
The front end of the design flow for an FPGA is similar to that of an ASIC, but the
back end is different. The primary difference in the back-end portion of this FPGA
flow from the ASIC flow is the placement and routing of FPGAs. With ASICs, the
place and route software determines how the IC will be manufactured. With FPGAs,
the synthesis and place and route software determine how the pre-manufactured IC
will be programmed. This book is focused on the front-end steps 2 and 3, RTL
modeling and simulation, where there is very little difference between ASIC and FPGA
design.
The truth comes close to this ideal. Most, but not all, RTL code will synthesize
equally well for both ASICs and FPGAs. There are exceptions to this generality,
however. Some aspects of an RTL model need to take into consideration whether the
design will be implemented in an ASIC or an FPGA. These aspects include:
Resets. Most ASIC cell libraries include both synchronous and asynchronous reset
flip-flops. Design engineers can write RTL models using the reset type deemed best
for the design. Some FPGAs are not as flexible, and only have flip-flops with one
type of reset (typically synchronous). While a synthesis compiler can map RTL
models with asynchronous resets to gate-level synchronous resets, or vice versa,
extra logic gates will be required. Many FPGAs also support global reset
functionality and power-up flip-flop states that ASICs do not have. Chapter 8,
section 8.1.5 (page 286) discusses modeling resets in more detail.
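The two reset styles differ only in the sensitivity list of the sequential procedure, as the following sketch suggests. The module and signal names are hypothetical, and an active-low reset named rstN is assumed for both flip-flops.

```systemverilog
// Sketch: synchronous versus asynchronous reset flip-flops.
module reset_styles_sketch
(input  logic clk, rstN, d,
 output logic q_sync, q_async
);
  // Synchronous reset: rstN is only sampled on the rising clock edge
  always_ff @(posedge clk)
    if (!rstN) q_sync <= '0;
    else       q_sync <= d;

  // Asynchronous reset: a falling edge on rstN also triggers the procedure
  always_ff @(posedge clk or negedge rstN)
    if (!rstN) q_async <= '0;
    else       q_async <= d;
endmodule: reset_styles_sketch
```

Which form maps directly onto the target device's flip-flops, without extra gates, depends on the ASIC cell library or FPGA family being used.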
Vector sizes. ASICs are fairly unconstrained for maximum vector widths and vector
operations. Complex operations on large vectors will require a lot of logic gates, but
the standard cell architecture used in most ASICs can accommodate these operations.
FPGAs are more rigid in this regard. The predefined number of CLBs and their
placement within an FPGA can limit the ability to implement complex operations on
very large vectors, either due to the number of CLBs available, or the complexity of
routing the interconnections between the CLBs. This difference between ASICs and
FPGAs does mean that, even at the RTL level of abstraction, the design engineer must
keep in mind the limits of the device to which the functionality will be targeted.
Most of the examples, coding styles, and guidelines presented in this book apply
equally to both ASIC and FPGA design. Specific mention is made for the rare
exceptions when targeting an ASIC or FPGA impacts RTL coding style.
Digital simulation is a software program that applies logic value changes, called
stimulus, to the inputs of a model of a digital circuit, propagates that stimulus through
the model in the same way in which actual silicon would propagate those logic value
changes, and provides mechanisms for observing and verifying the results of that
stimulus.
SystemVerilog is a digital simulation language that works with zeros and ones. The
language does not represent analog voltage, capacitance, and resistance.
SystemVerilog provides programming constructs to model digital circuits, to model
stimulus generators, and to model verification checkers.
This book focuses strictly on the first of these aspects, modeling digital circuits, and
this topic will be discussed and illustrated in detail in subsequent chapters. Example
1-4 illustrates a simple digital circuit model that can be simulated. This is the same
circuit shown earlier in this chapter as Example 1-3 (page 12).
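Example 1-4: Design model with input and output ports (a 32-bit adder/subtractor)
module rtl_adder_subtractor
(input  logic        clk,   // 1-bit scalar input
 input  logic        mode,  // 1-bit scalar input
 input  logic [31:0] a, b,  // 32-bit vector inputs
 output logic [31:0] sum    // 32-bit vector output
);
  // The body is missing from this copy. It is reconstructed here as an
  // assumption from the surrounding text: registered on posedge clk,
  // add when mode is 0, subtract when mode is 1.
  always_ff @(posedge clk) begin
    if (mode == 0) sum <= a + b;
    else           sum <= a - b;
  end
endmodule: rtl_adder_subtractor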
Observe in this example that the model has input ports and output ports. In order to
simulate this model, stimulus must be provided that applies logic values to the input
ports, and a response checker must be provided to observe the output ports. Although
not the focus of this book, a brief overview of stimulus and response checking is
provided here to show what is involved to simulate a SystemVerilog model.
A testbench is used to encapsulate the stimulus generation and response
verification. There are many ways a testbench can be modeled in SystemVerilog, and the
code within the testbench can range from simple programming statements to elaborate
object-oriented, transaction-level programming. Example 1-5 illustrates a simple
testbench for the 32-bit adder/subtractor design.
// The module header is missing from this copy. The port list below is
// reconstructed as an assumption from the signals used in the testbench.
module test
(output logic [31:0] a, b,
 output logic        mode,
 input  logic [31:0] sum,
 input  logic        clk
);
  // generate stimulus
  initial begin
    repeat (10) begin
      @(negedge clk);
      void'(std::randomize(a) with {a >= 10; a <= 20;});
      void'(std::randomize(b) with {b <= 10;});
      void'(std::randomize(mode));
      @(negedge clk) check_results;
    end
    @(negedge clk) $finish;
  end

  // verify results
  task check_results;
    $display("At %0d: \t a=%0d b=%0d mode=%b sum=%0d",
             $time, a, b, mode, sum);
    case (mode)
      1'b0: if (sum !== a + b)
              $error("expected sum = %0d", a + b);
      1'b1: if (sum !== a - b)
              $error("expected sum = %0d", a - b);
    endcase
  endtask
endmodule: test
Initial and always procedures. The main block of code in Example 1-5 is an initial
procedure, which is a type of procedural block. Procedural blocks contain programming
statements and timing information to instruct simulators what to do, and when
to do it. SystemVerilog has two primary types of procedural blocks, initial procedures
and always procedures.
Initial procedures are defined with the keyword initial. An initial procedure,
despite its name, is not used for initializing designs. Rather, an initial procedure
executes its programming statements one time. When the last statement is reached, the
initial procedure is not executed again for a given run of a simulation. Initial
procedures are not synthesizable, and are not used for RTL modeling. This book focuses on
writing RTL models for simulation and synthesis, and therefore does not discuss
initial procedures in any more depth.
Always procedures are defined with the keywords always, always_comb,
always_ff and always_latch. An always procedure is an infinite loop. When the
procedure has completed execution of the last statement in the procedure, the
procedure automatically returns to the beginning, and starts the procedure again. For RTL
modeling, an always procedure must begin with a sensitivity list, such as the
@(posedge clk) definition shown in Example 1-4 (page 18). The various forms of
always procedures are discussed in more detail in Chapters 6 through 9.
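The specialized forms of always procedures can be sketched side by side. The following is only an illustration, and the module and signal names are hypothetical. Note that always_comb and always_latch infer their sensitivity lists automatically, whereas always_ff is written with an explicit clock edge.

```systemverilog
module always_forms   // hypothetical module name
(input  logic       clk, en,
 input  logic [7:0] d,
 output logic [7:0] y_comb, q_ff, q_latch
);
  // Combinational logic: re-evaluated whenever d changes
  always_comb y_comb = d + 8'd1;

  // Sequential logic: evaluated on each rising edge of clk
  always_ff @(posedge clk) q_ff <= d;

  // Latched logic: q_latch follows d while en is high
  always_latch
    if (en) q_latch <= d;
endmodule: always_forms
```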
A procedural block can contain a single statement, or a group of statements. Multiple
statements in a procedural block are grouped together between the keywords
begin and end (verification code can also group statements between the keywords
fork and join, join_any or join_none). Statements between begin and end are
executed in the order in which they are listed, i.e.: beginning with the first statement
and ending with the last statement.
The initial procedure in Example 1-5 contains a repeat loop. This loop is defined
to execute 10 times. Each pass of the loop:
1. Delays to the negative edge of the clk signal.
2. Generates random values for the a, b and mode inputs of the design.
3. Delays until the next negative edge of clk, and then calls a check_results
task (a subroutine) to verify that the output of the design matches a calculated
expected result.
The design operates on the positive edge of its clock input. The testbench uses the
opposite edge of this same clock, in order to avoid driving the inputs and reading the
outputs of the design on the clock edge that is used by the design. If the testbench
drove values on the positive edge of the clock, there would be zero setup time for
those inputs to become stable before the design used the inputs. Similarly, if the
testbench verified the design results on the positive edge of clock, there would be zero
time for those design outputs to become stable.
Modifying and reading a value at the same instant of time is referred to as a
simulation race condition. Using the opposite edge of the design clock to drive stimulus is a
simple way for a testbench to avoid simulation race conditions with the design, and
to meet the setup and hold time requirements of the design. SystemVerilog provides much
more effective ways for a testbench to avoid race conditions with the design being
tested, which are beyond the scope of this book. The author recommends the book
SystemVerilog for Verification³ by Chris Spear and Greg Tumbush for more details
on SystemVerilog’s versatile and powerful verification constructs, and proper
verification coding styles.
The testbench is modeled as a module with input and output ports, similar to the
design being verified. The last step is to connect the testbench ports to the design
ports, and generate the clock. This is done in a top-level module. Example 1-6 shows
the code for this.
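module top;
  logic [31:0] a, b;
  logic        mode;
  logic [31:0] sum;
  logic        clk;

  // The instances are missing from this copy. They are reconstructed here
  // as an assumption; the instance names are hypothetical.
  test                 i1 (.*);
  rtl_adder_subtractor i2 (.*);

  initial begin
    clk <= 0;
    forever #5 clk = ~clk;
  end
endmodule: top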
3. “SystemVerilog for Verification, Third Edition”, Chris Spear and Greg Tumbush. Copyright 2012,
Springer, New York, NY. ISBN: 978-1-4614-0715-7.
Not all declarations are order dependent. For example, SystemVerilog allows module
names to be referenced before the module is compiled. Within a module, tasks and
functions can be called before being defined, as long as the definition is within the
module.
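A small sketch of this rule follows. The module name order_demo and the function name max_of are hypothetical, chosen only for illustration. The function is called in the always_comb procedure before its definition appears later in the same module.

```systemverilog
module order_demo   // hypothetical module name
(input  logic [7:0] a, b,
 output logic [7:0] y
);
  // max_of() is called here, before its definition appears below
  always_comb y = max_of(a, b);

  // The definition is within the same module, so the earlier call is legal
  function automatic logic [7:0] max_of(input logic [7:0] p, q);
    return (p > q) ? p : q;
  endfunction
endmodule: order_demo
```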
endmodule: adder
Compiler directives, such as `timescale, are not fully global. They only affect
source code that is compiled after the directive is read in, and in the same invocation
of the compiler. Any source code that was compiled before the directive was
encountered is not affected by the directive. Likewise, any code that is compiled as a separate
single-file compilation is not affected by the directive. The `timescale directive
can affect multiple files when more than one file is compiled at the same time. The
order in which multiple files are compiled becomes critical when only some of the
files have a `timescale directive. Compiling the files in a different order can lead to
different simulation results because of the semi-global behavior of `timescale.
The timeunit and timeprecision keywords are local to the module in which
they are used, and do not have the compilation order side effects of compiler
directives.
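As a hedged illustration of module-local time units, the sketch below declares timeunit and timeprecision inside a small clock generator. The module name clock_gen and the chosen values are assumptions made for this example only.

```systemverilog
module clock_gen (output logic clk);   // hypothetical module name
  timeunit      1ns;   // delays in this module are in nanoseconds
  timeprecision 1ps;   // delays are rounded to the nearest picosecond

  initial begin
    clk = 0;
    forever #2.5 clk = ~clk;   // 2.5 ns half period, 5 ns clock period
  end
endmodule: clock_gen
```

Because the declarations are inside the module, the delay of 2.5 has the same meaning regardless of the order in which the file is compiled.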
Simulation-level time units. The IEEE SystemVerilog standard does not specify a
default time unit or precision that simulators should use if no timeunit and
timeprecision statement has been specified, and no `timescale compiler directive is
in effect. Different simulators have different behaviors in this circumstance, ranging
from making it a compilation error, to having an implied default unit and precision.
All simulators provide an invocation option to set the default time unit and time
precision to be used in any module that does not have this information declared within
the module or by a `timescale compiler directive. An example invocation option
that works with most simulators is:
-timescale=1ns/1ps
Refer to the documentation of the specific simulator for the command-line
invocation option that sets the simulation time units and precision.
NOTE
Most code examples in this book do not include timeunit and timeprecision
statements. These statements were omitted in order to focus on the
RTL code in each example.
The example files for this book that can be downloaded (see page xxx in the
Preface) have a timeunit and timeprecision statement in every module,
interface or package, even when there are no delays within the model. This
ensures that the models will compile with the same time units on all simulators.
ies, with no propagation delay within a clock cycle. Propagation delays are frequently
used in verification testbenches.
The # token is used to represent a delay. The following code snippet shows the use
of a delay to model a clock oscillator in a testbench.
always #5 clk = ~clk;
A specific time unit can be specified with a delay. This is particularly useful when a
delay requires a different time unit than the units of the module.
always #5ns clk = ~clk;
In practice, SystemVerilog simulators will optimize the simulation time line, and
only create time slots for the times in which actual simulation activity will occur. This
optimization means that the time precision used in the source code has little or no real
impact on simulation run-time performance.
1.5.3.5 Active and NBA Update events, blocking and nonblocking assignments
SystemVerilog divides a simulation time slot into several event regions, which are
processed in a controlled order. This provides design and verification engineers some
control over the order in which events are processed. Most of the event regions are for
verification purposes, and are not discussed in this book. RTL and gate-level models
primarily use two of these event regions: the Active event region and the NBA Update
event region. The relationship of these two regions is shown in Figure 1-8.
There are several important details to understand regarding how event regions are
processed during simulation:
• Events in a begin-end statement group are scheduled into an event region in the
order in which the statements are listed, and are executed in that order when the
event region is processed.
• Events from concurrent processes, such as multiple always procedures, are
scheduled into the event region in an arbitrary order chosen by the simulator. The RTL
designer has no control over the order in which concurrent events are scheduled,
but the designer does have control regarding the region in which events are
scheduled.
• Simulators will execute all scheduled events in a region before transitioning to the
next region. As events are processed, they are permanently removed from the event
list. Each region will be empty before simulation proceeds to the next region.
• As events are processed in a later region, they can possibly schedule new events in
a previous region. After the later region has been processed, simulation will cycle
back through the event regions to process any newly scheduled events. The
iterations through all event regions will continue until all regions are empty (i.e.: no new
events for that moment of simulation time are being scheduled).
• The transition from one event region to the next is referred to as a delta. Each
iteration through all event regions is referred to as a delta cycle.
NOTE
Proper usage of these two event regions is critical in order to obtain correct
RTL simulation results.
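The importance of the two regions can be sketched with two concurrent procedures that exchange values. This is a simulation-only illustration with hypothetical names (swap_demo, a, b); it is not one of the numbered examples in this book. The right-hand sides of both nonblocking assignments are evaluated in the Active region using the old values, and the left-hand sides are updated afterwards in the NBA Update region, so the result does not depend on which procedure the simulator happens to run first.

```systemverilog
module swap_demo;   // hypothetical simulation-only demonstration
  timeunit 1ns; timeprecision 1ns;

  logic       clk = 0;
  logic [7:0] a = 8'd1, b = 8'd2;

  always #5 clk = ~clk;   // first positive edge at 5 ns

  // Active region:     both right-hand sides are read (old a, old b)
  // NBA Update region: a and b receive the values that were read
  always_ff @(posedge clk) a <= b;
  always_ff @(posedge clk) b <= a;

  initial begin
    @(negedge clk);       // sample well after the first positive edge
    $display("a=%0d b=%0d", a, b);   // a and b have exchanged values
    $finish;
  end
endmodule: swap_demo
```

If blocking assignments (=) were used instead, both updates would occur in the Active region, and the final values would depend on the arbitrary order in which the simulator executed the two procedures.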
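Example 1-7: A clock oscillator, stimulus and flip flop to illustrate event scheduling
/////////////////////////////////////////////////////////////
// Design module with RTL model of a D-type register
/////////////////////////////////////////////////////////////
module d_reg (input  logic       clk,
              input  logic [7:0] d,
              output logic [7:0] q
             );
  timeunit 1ns; timeprecision 1ns;
  // Body missing from this copy; reconstructed as an assumption
  // from the description that follows the example
  always_ff @(posedge clk)
    q <= d;
endmodule: d_reg
/////////////////////////////////////////////////////////////
// Top-level verification module with clock oscillator
/////////////////////////////////////////////////////////////
module top;
  timeunit 1ns; timeprecision 1ns;
  logic       clk;
  logic [7:0] d;
  logic [7:0] q;
  // Instances and oscillator missing from this copy; reconstructed as
  // an assumption (instance names are hypothetical)
  test  i1 (.*);
  d_reg i2 (.*);
  initial begin
    clk <= 0;
    forever #5 clk = ~clk;
  end
endmodule: top
// Stimulus module; header missing from this copy, port list assumed
module test (output logic [7:0] d);
  timeunit 1ns; timeprecision 1ns;
  initial begin
    d = 1;
    #7 d = 2;
    #10 d = 3;
    #10 $finish;
  end
endmodule: test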
For the clock initialization in the module top, the simulator will schedule an Active
event to evaluate the right-hand side of the assignment to clk at 0 nanoseconds,
and a Nonblocking update assignment to change the value of clk at 0 nanoseconds.
Active events are also scheduled to change clk at 5 nanoseconds, 10 nanoseconds,
and so forth.
For the test stimulus, the simulator will schedule Active events on d at times 0
nanoseconds, 7 nanoseconds, and 17 nanoseconds (7 + 10), and schedule a $finish
command at time 27 nanoseconds (7 + 10 + 10).
For the RTL flip flop, the simulator will schedule Active events to evaluate d at
times 5 nanoseconds, 15 nanoseconds and 25 nanoseconds (the positive edges of
clk), and NBA Update events to update q at times 5 nanoseconds, 15 nanoseconds
and 25 nanoseconds.
Figure 1-9 illustrates what the simulation time line might look like when Example
1-7 is simulated. This diagram shows the two-step process for nonblocking
assignments as an Active event that assigns to an internal temporary variable, named with
a prime after the signal name (for example, q'), and an NBA Update event that assigns
the internal temporary variable to the actual signal.
Figure 1-9: Simulation time line and time slots with some events scheduled
Note that Figure 1-9 is a conceptual illustration of simulation behavior. It does not
reflect how simulators implement this behavior in the simulator’s internal algorithms.
The full event scheduling algorithm is much more complex than shown in this
diagram, and is described in full in the IEEE 1800 SystemVerilog standard.
2. The technology library for the target ASIC or FPGA: this library is provided
by the ASIC or FPGA vendor, and contains the definitions of the standard cells
(for ASIC) or gate-array blocks (for FPGA) that are available to implement the
desired functionality.
3. Synthesis constraint definitions: these constraints are defined by the design
engineer, and provide the synthesis compiler information that is not available in
the RTL code, such as the desired clock speed, area and power goals that need to
be realized in the ASIC or FPGA.
For front-end design and verification purposes, the primary output of synthesis is a
gate-level netlist. A netlist is a list of components and the wires (called nets) that
connect these components together. The components referenced in the netlist will be
ASIC standard cells or FPGA blocks that were used to implement the desired
functionality. This netlist can be in a number of formats, including EDIF, VHDL,
Verilog-2001 or SystemVerilog. Only the SystemVerilog output will be used in this book.
Simulation models of each component are required in order to simulate a
SystemVerilog netlist. A simulation library written in SystemVerilog will be provided by the
target ASIC or FPGA vendor. Often, these libraries only use the Verilog-2001 subset
of SystemVerilog. These components are modeled at the gate-level with detailed
propagation delays. The models are very different than the abstract RTL models
written by design engineers. This book does not examine this low-level of modeling.
Synthesis compiler limitations. The synthesizable RTL models shown in this book
reflect the subset of the SystemVerilog language that is supported by most major
synthesis compilers.
SystemVerilog is a dual purpose language. One purpose is to model the behavior of
digital hardware. A second purpose is to code verification programs to test the
hardware models. These two purposes have very different language requirements. Many
general purpose programming constructs are useful for both purposes, such as an
if-else decision or a for-loop. Other language features are intended strictly for
verification, such as constrained random test generation. These verification constructs do not
represent hardware functionality, and are not intended to be supported by synthesis
compilers.
The IEEE has not identified an official synthesizable subset for SystemVerilog.
This shortcoming of the standard has led to important deviations in what each synthe-
order. If the same package is used by several sub blocks, the package will need to be
recompiled with each sub block that is compiled separately from other sub blocks. The
second consideration is that any global declarations, including `define compiler
directives, will not be seen in each separate compilation. This same problem exists
with simulators that support single-file compilation, and the same guideline discussed
in section 1.5.2 (page 21) applies: avoid the use of global declarations and
definitions. Single-file and multi-file compilation do not see the same global space.
1.6.3 Constraints
Figure 1-10 on page 31 showed that one of the three primary inputs into synthesis is
constraint definitions. Constraints are used to define information that synthesis needs,
but which is neither in the RTL models nor the ASIC/FPGA vendor’s technology
library. Figure 1-11 illustrates a simple circuit where some of the information that
synthesis requires must be specified by the design engineer, using synthesis
constraints.
The process of synthesizing this functional flow of data into logic gates involves:
• Mapping the inferred flip-flop ff1 to an appropriate flip-flop in the target ASIC or
FPGA.
• Mapping the functionality described in logic_block_1 to the standard cells or
logic blocks of the target ASIC or FPGA.
• Optimizing the implementation of logic_block_1 to meet the setup and hold
requirements of ff1.
• Mapping the functionality described in logic_block_2 to the standard cells or
logic blocks of the target ASIC or FPGA.
• Optimizing the implementation of logic_block_2 to meet the output arrival
requirements of the design specification.
In order to realize the simple circuit shown in Figure 1-11 in a target ASIC or
FPGA, synthesis compilers must know:
1. The propagation delays, area and power requirements of the standard cells or
logic blocks used to implement logic_block_1 and logic_block_2.
2. The setup and hold times of ff1.
3. The period or frequency of clk, such as 100 MHz.
4. The arrival time of in1 relative to the active edge of clk.
5. The drive capability of the external source for in1.
6. The arrival time of out1 relative to the active edge of clk.
7. The output drive requirement for out1.
This information will not be in the RTL model. The specifications for the first two
items in this list, propagation delays and setup/hold times, will come from the
technology library provided by the ASIC or FPGA vendor. The remaining details must be
specified by the design engineer who is synthesizing the design. These specifications
are referred to as synthesis constraints. A larger, more complex design will require
many more synthesis constraints. The RTL coding examples in this book will discuss
applicable synthesis constraints, where appropriate. Guidelines are also provided for
simplifying the constraints that must be specified.
The way in which synthesis constraints are specified varies with different synthesis
compilers. This book does not discuss how constraints are specified for any specific
synthesis compiler. Readers are encouraged to refer to the product documentation for
this information.
The book “Constraining Designs for Synthesis and Timing Analysis”⁴ is a good
source for more information on specifying timing constraints for synthesis, as well as
for static timing analysis.
Synthesizable RTL coding rules impose a number of restrictions on how the
SystemVerilog language can be used. It can be frustrating to invest many hours of time
writing a SystemVerilog model that appears to follow synthesis coding rules and
simulates correctly, only to find out it will not synthesize on a specific synthesis compiler.
It is possible to periodically run the RTL code through a synthesis compiler as the
RTL functionality is being developed, in order to ensure that the final RTL code will
be synthesizable. Synthesis compilers are expensive software tools, however, and
many companies have only a limited number of synthesis licenses. It is not practical
to tie up a synthesis license in order to periodically check that RTL code adheres to
synthesis coding rules while that code is being developed.
4. “Constraining Designs for Synthesis and Timing Analysis”, Sridhar Gangadharan and Sanjay
Churiwala. Copyright 2013, Springer, New York, NY. ISBN 978-1-4614-3268-5.
Lint checkers can be used in place of a synthesis compiler to check that RTL code
meets synthesizable RTL coding rules. A SystemVerilog lint checker is a software
tool that parses SystemVerilog source code and checks it against specific coding
rules. Lint checkers are configurable. Engineers can enable or disable specific checks,
and often add checks for guidelines or requirements at a specific company or on a
specific project.
There are a number of commercial lint checkers for SystemVerilog. This book does
not go into details on running any specific lint checker. Readers are encouraged to use
a lint checker of their choice when trying out the examples in this book or other
SystemVerilog RTL models.
A Logic Equivalence Checker (LEC) is an engineering tool that analyzes the
functionality of two models to determine if the models are logically the same. LEC tools
can compare two versions of an RTL model, or two versions of a gate-level model. The
most common application of logic equivalence checking, however, is to compare the
functionality of an RTL model with the post-synthesis gate-level model.
This type of checking is often done after Engineering Change Orders (ECOs) or other
types of changes have been made to the gate-level design that was created during the
synthesis process.
Logic equivalence checking is one type of formal verification, and is sometimes
referred to as formal verification. The term formal verification, however, is a more
general term that can also encompass other types of engineering tools. Another type
of formal verification is the use of SystemVerilog Assertions (SVA) to describe
design behavior that should, or should not, occur.
The use of assertions in design and verification is not discussed in this book. The
author of this book is a proponent of using SystemVerilog Assertions in design code, and
of having design engineers add assertions into RTL models as part of writing the RTL
code. More information on this topic can be found in the white paper, “Who Put
Assertions In My RTL Code? And Why? How RTL Design Engineers Can Benefit
from the Use of SystemVerilog Assertions”⁵.
5. “Who Put Assertions In My RTL Code? And Why? How RTL Design Engineers Can Benefit from the
Use of SystemVerilog Assertions”, Stuart Sutherland. Presented at the 2013 Silicon Valley Synopsys Users
Group Conference (SNUG). Available for download at sutherland-hdl.com.
1.9 Summary
Chapter 2
RTL Modeling Fundamentals
Abstract — This chapter covers the general modeling constructs used for writing
Register Transfer Level (RTL) models in SystemVerilog. The topics presented in this
section include:
• Modules and procedural blocks
• SystemVerilog language rules (comments, reserved keywords, user-defined names)
• Best practices for naming conventions
• System tasks and functions
• Compiler directives
• Module declarations
• Module port declarations
• Hierarchical models
• Module instances
• Port connections
• Netlists
Modules are the primary modeling block in SystemVerilog. Modules are declared
with the keyword module, and are completed with the keyword endmodule. A module
is a container that holds information about a model, including inputs and outputs to
the module, data type declarations, and executable code. Section 2.3 (page 52)
discusses module declarations in more detail.
Executable code within a module is contained in procedural blocks. SystemVerilog
has two primary types of procedural blocks, initial procedures, defined with the
keyword initial, and always procedures, defined with the keywords always,
always_comb, always_ff and always_latch. Initial procedures are not
synthesizable, and are not used for RTL modeling. Always procedures are infinite loops.
When the procedure has completed execution of the last statement in the procedure,
the procedure automatically returns to the beginning, and starts the procedure again.
For RTL modeling, an always procedure begins with a sensitivity list, such as
@(posedge clock). The various forms of always procedures, and how they are
used for RTL modeling, are discussed in more detail in Chapters 6 through 9.
A procedural block can contain a single statement, or a group of statements. Multiple
statements in a procedural block are grouped together between the keywords
begin and end. Statements between begin and end are executed in the order in
which they are listed.
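The effect of statement order inside a begin...end group can be sketched as follows. The module name minmax and its ports are hypothetical. The default assignments are executed first, and the if decision, which is executed afterwards, overrides them when needed.

```systemverilog
module minmax   // hypothetical module name
(input  logic [7:0] a, b,
 output logic [7:0] lo, hi
);
  always_comb begin
    lo = a;             // executed first: default assignments
    hi = b;
    if (a > b) begin    // executed next: override when a is larger
      lo = b;
      hi = a;
    end
  end
endmodule: minmax
```

Reversing the order, so that the defaults came after the if decision, would always overwrite the swapped values, which is why the listed order matters.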
2.2.1 Comments
SystemVerilog has two types of general comments: one-line and block. There are
also two special types of comments: pragmas and attributes.
One-line comments begin with a // token, and are terminated with a new line. The
comment can begin anywhere on a line, and comments out the rest of that line.
According to the SystemVerilog standard, the // token, and all text following it up to
the new line, are ignored. Synthesis compilers, however, do not fully adhere to this
rule. Synthesis compilers hide synthesis-specific commands within a form of
comment called a pragma, which is discussed later in this section.
Block comments begin with a /* token, and are terminated with a */ token. New
lines between the tokens are ignored, allowing the comment to span any number of
lines. Block comments cannot be nested. Once a /* starting token is encountered, a
parser will ignore all text, including another /*, until an ending */ token is
encountered. A nested /* is not seen as the start of a nested comment.
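A short sketch of both general comment types is shown here, using a hypothetical module named comment_demo. The second /* inside the block comment is treated as ordinary text, so the comment ends at the first */ token that follows.

```systemverilog
module comment_demo   // a one-line comment runs to the end of the line
(input  logic a, b,
 output logic y
);
  /* A block comment can span
     several lines. A second /* inside it is just text,
     so the comment ends at the first closing token: */
  assign y = a & b;
endmodule: comment_demo
```

Some tools might issue a warning for the embedded /* token, even though it is legal.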
Block comments are useful for temporarily commenting out a section of code when
debugging a model. Since block comments cannot be nested, it is difficult to
temporarily comment out code sections if block comments have been used within that
section. Block comments are okay for model headers.
Example 2-1 shows the use of both one-line and block comments.
/*
 * Specification:
* Performs unsigned 32-bit arithmetic, with no overflow or
* underflow. A mode control selects whether the operation is
* an add or a subtract.
* - Add when mode is low
* - Subtract when mode is high
* The output is registered.
* The register has an active low, asynchronous reset.
*
*/
Attributes begin with a (* token, and are terminated with a *) token. Attributes
must be on a single line, and cannot contain a new line. An attribute is a special type
of comment that contains information for specific software tools. A synthesis
attribute, for example, will be ignored by simulators, but will be read by synthesis
compilers. An attribute comment is associated with a specific language construct. For
example, a module can have an attribute associated with it, and a programming
statement can have an attribute associated with it. Attributes are used extensively by
mixed analog/digital simulators. An old, obsolete Verilog synthesis standard,
IEEE 1364.1, defined several synthesis attributes. These attributes are seldom used,
and are not discussed in this book.
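The placement of attributes can be sketched as follows. The attribute names tool_hint and another_hint are hypothetical placeholders that only illustrate the syntax; they are not attributes defined by any standard or recognized by any specific tool.

```systemverilog
(* tool_hint = "example" *)    // hypothetical attribute on a module
module attr_demo
(input  logic [1:0] sel,
 input  logic       a, b, c,
 output logic       y
);
  always_comb begin
    (* another_hint *)         // hypothetical attribute on a statement
    case (sel)
      2'b00:   y = a;
      2'b01:   y = b;
      default: y = c;
    endcase
  end
endmodule: attr_demo
```

A tool that does not recognize an attribute name is expected to ignore it, which is what allows tool-specific information to be embedded without changing simulation behavior.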
Pragmas are a special form of a one-line or block comment. A pragma comment
begins with a word that identifies the rest of the comment as containing information
for specific software tools. Synthesis compilers make extensive use of pragmas to
provide information that aids the synthesis process, but which is irrelevant to
simulation. Comments that begin with the word synthesis or pragma are recognized as a
pragma by nearly all commercial synthesis compilers.
An example of using synthesis pragmas is:
always_comb begin
  if (select == 0) y = a;
  // synthesis translate_off
  else if ($isunknown(select))
    $warning("select has incorrect value at %t", $realtime);
  // synthesis translate_on
  else y = b;
end
Another example synthesis pragma is:
case (mode) // synthesis full_case
NOTE
At the time this book was written, one commercial synthesis compiler did
not recognize // synthesis as a synthesis pragma. That compiler required
that pragmas start with // pragma or // synopsys.
Some synthesis compilers might also recognize other words within a comment as
defining a synthesis pragma. These other words are tool-specific, and the comment
might not be treated as a pragma by every synthesis compiler.
Synthesis pragmas are important constructs for writing synthesizable RTL models.
Other synthesis pragmas will be discussed throughout this book.
NOTE
Care must be taken when using synthesis pragmas. Simulators ignore these
comments, and do not simulate the effects of the pragma. This means that the
code that is simulated and verified for correct functionality might not be the
same behavior implemented by a synthesis compiler.
Coding guidelines on the proper usage of specific pragmas are presented as those
pragmas are discussed in this book.
and maintainable. Example 2-2 illustrates a simple RTL 32-bit adder with minimal
white space. Example 2-3 shows the same RTL adder, but with added white space.
Although both examples are syntactically correct and functionally identical, it should
be obvious that the second example is a more readable coding style.
Example 2-3: SystemVerilog RTL model with good use of white space
module rtl_adder_good_style
( input  logic [31:0] a, b,
  output logic [31:0] sum,
  output logic        co
);
  always_comb begin
    {co, sum} = a + b;
  end
endmodule: rtl_adder_good_style
Spaces versus tabs. Example 2-3 uses white space to indent code. One level of
indentation is used between the module...endmodule block. A second level of
indentation is used for the code between the begin...end block. Indentation helps make it
easier to see what code is within each block. Either the space character or a tab can be
used to indent code. Programmers are often passionate about whether spaces or tabs
should be used for indentation. In practice, either style works fine. It is helpful,
however, if all members of a design team use one style or the other. Consistent use of
either spaces or tabs within a project makes it easier for engineers to read and
maintain code developed by other engineers on the project.
The author of this book prefers using spaces for indentation, and specifically two
spaces for each level of indentation. By using spaces, code will indent the same
amount on all editors, regardless of what the tab settings might be. Two spaces is
enough to see each level of indentation, and yet does not indent so far over that code
does not fit well on each line.
Table 2-1 lists the reserved keywords in the 1800-2012 SystemVerilog standard.
Table 2-1: SystemVerilog-2012 reserved keywords
accept_on     endchecker    inside           pullup               sync_accept_on
alias         endclass      instance         pulsestyle_ondetect  sync_reject_on
always        endclocking   int              pulsestyle_onevent   table
always_comb   endconfig     integer          pure                 tagged
always_ff     endfunction   interconnect     rand                 task
always_latch  endgenerate   interface        randc                this
and           endgroup      intersect        randcase             throughout
assert        endinterface  join             randsequence         time
assign        endmodule     join_any         rcmos                timeprecision
assume        endpackage    join_none        real                 timeunit
automatic     endprimitive  large            realtime             tran
before        endprogram    let              ref                  tranif0
begin         endproperty   liblist          reg                  tranif1
bind          endspecify    library          reject_on            tri
bins          endsequence   local            release              tri0
binsof        endtable      localparam       repeat               tri1
bit           endtask       logic            restrict             triand
break         enum          longint          return               trior
buf           event         macromodule      rnmos                trireg
bufif0        eventually    matches          rpmos                type
bufif1        expect        medium           rtran                typedef
byte          export        modport          rtranif0             union
case          extends       module           rtranif1             unique
casex         extern        nand             s_always             unique0
casez         final         negedge          s_eventually         unsigned
cell          first_match   nettype          s_nexttime           until
chandle       for           new              s_until              until_with
checker       force         nexttime         s_until_with         untyped
class         foreach       nmos             scalared             use
clocking      forever       nor              sequence             uwire
cmos          fork          noshowcancelled  shortint             var
config        forkjoin      not              shortreal            vectored
const         function      notif0           showcancelled        virtual
constraint    generate      notif1           signed               void
context       genvar        null             small                wait
continue      global        or               soft                 wait_order
cover         highz0        output           solve                wand
covergroup    highz1        package          specify              weak
coverpoint    if            packed           specparam            weak0
cross         iff           parameter        static               weak1
deassign      ifnone        pmos             string               while
default       ignore_bins   posedge          strong               wildcard
defparam      illegal_bins  primitive        strong0              wire
design        implements    priority         strong1              with
disable       implies       program          struct               within
dist          import        property         super                wor
do            incdir        protected        supply0              xnor
edge          include       pull0            supply1              xor
else          initial       pull1
end           inout         pulldown
endcase       input
46 RTL Modeling with SystemVerilog for Simulation and Synthesis
Using these directives documents which version of SystemVerilog was in use when
the code was developed, and helps ensure that the code will be compatible with
current and future versions of SystemVerilog.
Chapter 2: RTL Modeling Fundamentals 47
NOTE
SystemVerilog compiler directives are not bound by files. A
`begin_keywords directive in one file will affect all subsequent files that
are read in by the compiler for that invocation of the compiler. This can
result in file order dependencies and different behavior when files are
compiled separately instead of together.
To avoid these side effects, every `begin_keywords directive should be
paired with an `end_keywords directive in the same file.
Example 2-4 and Example 2-5 illustrate using the `begin_keywords and
`end_keywords directives in order to mix a legacy Verilog model with a newer
SystemVerilog model. Example 2-4 is written using the older Verilog-2001
standard. The code uses priority as a user-defined name, which was legal in
Verilog. Example 2-5 is written using SystemVerilog, where priority is a
reserved keyword. By using the `begin_keywords directives, a SystemVerilog
compliant simulator or synthesis compiler can read in both models, either
together or separately, even though Example 2-4 is not compatible with
SystemVerilog’s reserved keyword set.
Note: Examples 2-4 and 2-5 are functionally equivalent, but use different
programming constructs. Chapter 6 discusses the programming constructs used in
these examples, and Chapter 7 discusses how these two modeling styles can affect
synthesis results.
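The pairing of the two directives can be sketched as follows. This is a minimal illustration written for this discussion, not a reproduction of Example 2-4 or Example 2-5; the module names legacy_encoder and sv_encoder and their ports are hypothetical. The version strings "1364-2001" and "1800-2012" are version specifiers defined by the IEEE standard.

```systemverilog
`begin_keywords "1364-2001"   // use the Verilog-2001 reserved keyword list
module legacy_encoder         // hypothetical legacy model
( input  wire [1:0] request,
  output reg  [1:0] priority  // legal user-defined name in Verilog-2001
);
  always @(request) begin
    if      (request[1]) priority = 2'b10;
    else if (request[0]) priority = 2'b01;
    else                 priority = 2'b00;
  end
endmodule
`end_keywords                 // restore the previous keyword list

`begin_keywords "1800-2012"   // use the SystemVerilog-2012 keyword list
module sv_encoder             // hypothetical SystemVerilog model
( input  logic [1:0] request,
  output logic [1:0] grant
);
  always_comb begin
    priority case (1'b1)      // here priority is a reserved keyword
      request[1]: grant = 2'b10;
      request[0]: grant = 2'b01;
      default   : grant = 2'b00;
    endcase
  end
endmodule: sv_encoder
`end_keywords
```

Because each `begin_keywords has a matching `end_keywords in the same region of code, neither keyword list leaks into whatever is compiled next.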
NOTE
The code examples shown in this book do not include the
`begin_keywords and `end_keywords directives, but the downloadable
example files do have these directives. They are omitted from the book to
save space and to focus on the concepts being illustrated in each example.
Using .v and .sv file names to distinguish Verilog and SystemVerilog code. The
IEEE SystemVerilog standard does not require files be named with any specific file
extensions. However, simulators and synthesis compilers use the file name extension
to determine which reserved keyword list should be used by a compiler. Files ending
with .v are assumed to be written using an older Verilog reserved keyword list. Files
ending with .sv are assumed to be written using a newer SystemVerilog reserved
keyword list.
A problem with file extensions is that they do not indicate which version of Verilog
or SystemVerilog to use. Simulators and synthesis compilers that use file extensions
can end up assuming different versions. This means a file ending with .v or .sv
might work correctly on one software tool, and have a keyword conflict with another
software tool. Using the `begin_keywords and `end_keywords directives works
with all SystemVerilog compliant software tools, and is preferred over relying on file
extensions.
Case sensitivity. SystemVerilog is case sensitive, meaning lower-case and
upper-case characters are treated as being different. The identifiers Input and
INPUT are different names, and are not the same as the reserved keyword input
(all reserved keywords in SystemVerilog are in lower-case).
Escaped names. Characters that are normally illegal can be used in an identifier by
escaping the identifier. An escaped name can use any printable ASCII character, can
start with a number, and can be a reserved keyword. Escaped identifiers begin with a
backslash ( \ ) and are terminated by a white space. All characters following the
backslash and up to a white space are escaped, including characters that would
normally separate names, such as commas, parentheses, and semicolons. Some
examples of escaped names are:
\741s74 \reset- \~enable \module
The backslash is part of the name. Every place that an escaped identifier is
referenced must include the backslash at the start of the name, and terminate
with a white space after the name.
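As a minimal sketch of how the escaped names listed above might be declared and referenced (the module name escaped_names_demo and the logic connecting the signals are assumptions made for illustration):

```systemverilog
module escaped_names_demo;
  logic \741s74 ;    // starts with a digit
  logic \reset- ;    // contains a hyphen
  logic \~enable ;   // contains a tilde
  logic \module ;    // same spelling as a reserved keyword

  // The white space before each semicolon is required. Without it, the
  // semicolon would become part of the escaped name.
  assign \~enable = ~\reset- ;
  assign \module  = \741s74 & \~enable ;
endmodule: escaped_names_demo
```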
Name spaces. Identifiers are local to the name space in which they are declared. The
constructs in SystemVerilog that introduce a local name space are:
• Definitions name space — a global space that can contain declarations of module,
primitive, program, and interface identifiers.
• Package name space — a global space that can contain declarations of package
identifiers.
• Component name space — a local name space introduced by the keyword module,
interface, package, program, checker, and primitive. The component
name space can contain declarations of tasks, functions, checkers, instance names,
named begin-end and fork-join blocks, parameter constants, named events, nets,
variables, and user-defined types. A module can also contain nested declarations of
modules and programs. The identifier of a nested declaration is local to the module,
and is not in the definitions name space.
• $unit compilation unit name space — a pseudo-global space that can contain
declarations of tasks, functions, checkers, parameter constants, named events,
nets, variables, and user-defined types. Declarations made outside of the
component name space are in the $unit space. Multiple $unit name spaces can
exist at the same time. Declarations in one $unit space are not shared with
other $unit spaces.
• Block name space — a local name space that is introduced by tasks, functions, and
named or unnamed begin-end and fork-join blocks. A block name space can
contain declarations of named blocks, named events, variables, and user-defined
types.
• Class name space — a local name space introduced by the keyword class. The
class name space can contain declarations of variables, tasks, functions, and nested
class declarations.
The same identifier cannot be declared twice in the same name space. However, it
is legal for the same name space to declare an identifier and to reference an identifier
of the same name that was declared in a different space. For example, when
instantiating a module, the same name can be used for the module name and its
instance name.
a suffix of _clock, _clk or _ck. All active-low signals might be declared with a
prefix of n_, or a suffix of _n.
Consistency within a project for other aspects of modeling is also beneficial. This
includes what goes into the comment section at the beginning of each model, code
indentation, and the order of input and output ports.
The author of this book does not promote or encourage using one convention over
another. What the author does promote is consistency. The consistent naming
convention used in this book is:
• Clocks are named clock or clk, or have _clk appended to the name.
• Active-high resets are named reset or rst. Active-low resets are named resetN
or rstN. Active-high sets are named set or preset. Active-low sets are named
setN or presetN.
• Other active-low signals have a capital N appended to the name.
• User-defined types have _t appended to the name.
• Constants are in all capital letters.
• Package names have _pkg appended to the name.
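The conventions in this list can be sketched in a single small model. The package counter_pkg, the module counter, and the signal enableN are hypothetical names chosen only to show each convention in use.

```systemverilog
package counter_pkg;                       // package name ends in _pkg
  localparam int WIDTH = 8;                // constant in all capital letters
  typedef logic [WIDTH-1:0] count_t;       // user-defined type ends in _t
endpackage: counter_pkg

module counter
  import counter_pkg::*;
( input  logic   clk,                      // clock
  input  logic   rstN,                     // active-low reset
  input  logic   enableN,                  // other active-low signal
  output count_t count
);
  always_ff @(posedge clk or negedge rstN) begin
    if (!rstN)         count <= '0;
    else if (!enableN) count <= count + 1'b1;
  end
endmodule: counter
```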
2.3 Modules
endmodule: addbit
Module name. A module is enclosed between the keywords module and
endmodule. Each module has a name, which is a user-defined identifier that must
adhere to the naming rules defined in section 2.2.5 (page 49). Optionally, the
same name can be specified after the endmodule keyword, separated by a colon.
The ending name must be identical to the module name. Specifying an ending name
can help with code documentation and maintenance in large, complex models, where
there might be many lines of code between the module and endmodule keywords.
Parameter list. Following the name of the module is an optional parameter list,
enclosed between the tokens #( and ). Parameters are used to make modules
configurable. The declaration and use of parameters is discussed in Chapter 3,
section 3.8 (page 93).
Port declarations. Modules can have any number of ports, including none. A port is
used to pass data into or out of a module. Ports have a direction, type, data type, size,
and name. The direction is declared with the keywords input, output, or inout
(bidirectional). SystemVerilog provides an extensive set of built-in types and data
types, as well as user-defined types, which can be used for a port type and data type.
The various types and data types that are used in synthesizable RTL modeling are
discussed in detail in Chapter 3. Syntactically, the size of a port can range from
1-bit wide to 2^16 (65536) bits wide. In practice, engineers must consider the
limitations of the ASIC or FPGA technology that will be used to implement the
design. For example, some devices might be able to handle a 64-bit wide data bus,
while other devices might only support a maximum of a 32-bit wide data bus.
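A minimal sketch of a module header that combines a parameter list with input, output and inout ports is shown below. The module bus_register, its parameter WIDTH, and its port names are hypothetical and are used only to illustrate the syntax described above.

```systemverilog
module bus_register
#(parameter int WIDTH = 16)                // optional parameter list
( input  logic             clk,            // 1-bit input port
  input  logic             load,           // 1-bit input port
  input  logic [WIDTH-1:0] d,              // WIDTH-bit input port
  output logic [WIDTH-1:0] q,              // WIDTH-bit output port
  inout  wire  [WIDTH-1:0] bus             // WIDTH-bit bidirectional port
);
  always_ff @(posedge clk)
    if (load) q <= d;

  // Drive the bus only when not loading; otherwise leave it tri-stated.
  assign bus = load ? 'z : q;
endmodule: bus_register
```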
Functionality. At the heart of each module is the functional description of the silicon
the module represents. This functionality can be described at a very detailed level, or
a very abstract level. The concept of model abstraction was introduced in section 1.2
(page 6) of Chapter 1. This book is focused on representing module functionality at
the synthesizable RTL abstraction, and is the topic of subsequent chapters.
Timing. RTL modeling abstracts away the timing details of silicon. Synthesizable
RTL dictates that timing is on clock cycle boundaries. RTL models are often referred
to as zero-delay models, because there is no timing detail within a clock cycle.
Complex designs are partitioned into smaller blocks that are connected together.
Each sub block is represented as a module.
Figure 2-2 illustrates a simple design where the processor module has been
partitioned into the sub blocks of a controller module, a 32-bit wide mux32
module, an alu module, a status_reg module, and two 32-bit wide reg32 modules.
Netlists and module instances. The SystemVerilog code for the processor
module illustrated in Figure 2-2 is a netlist that contains instances of the
controller, mux32, alu, status_reg and two reg32 modules. A netlist is a list of
one or more module instances, and the nets (wires) that connect the instances
together. A module instance is a reference to the name of a module. The partial
code for this netlist is:
module processor ( /* port declarations */ );
Then the instances of the two reg32 models in module processor would be:
NOTE
Port order connections are error prone. A simple coding mistake of listing a
connection in the wrong order can result in design bugs that are difficult to
debug, resulting in lost engineering time. Port order connections also make it
difficult to see which signals are connected to which ports.
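The hazard described in this note can be sketched with a deliberately small example. The module dff_en and the netlist port_order_demo are hypothetical and are not the reg32 model used elsewhere in this chapter. Both instances compile without error, but the second one swaps two connections.

```systemverilog
module dff_en                        // hypothetical register with load enable
( input  logic clk, load, d,
  output logic q
);
  always_ff @(posedge clk)
    if (load) q <= d;
endmodule: dff_en

module port_order_demo
( input  logic clk, load, d,
  output logic q_ok, q_bug
);
  // Connections are matched to ports purely by position.
  dff_en u_ok  (clk, load, d, q_ok);   // order matches the port list
  dff_en u_bug (clk, d, load, q_bug);  // load and d swapped: still compiles
endmodule: port_order_demo
```

Nothing in the second instance tells the compiler, or a reviewer, that d is driving the load port.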
Port order connections are not used in this book, and are only introduced here to
contrast port order connections to named port connections.
Using named port connections can help prevent accidental connection errors, as
well as making code more self-documenting. Named port connections associate a port
name with a local signal name connected to that port. There are three forms of named
port connections:
• Explicit named connection — shown in this section of the book.
• Dot-name connection shortcut — shown in section 2.4.3 (page 57).
• Dot-star connection shortcut — shown in section 2.4.4 (page 58).
In an explicit named connection, the name of the port is preceded by a period, and
the local signal name is enclosed in parentheses. For example:
reg32 b_reg (.load(load_b), .d(a_data), .clk(clk),
             .setN(), .rstN(rstN), .q(b_data), .qb() );
reg32 d_reg (.clk(clk), .load(load_d), .setN(setN),
             .rstN(rstN), .d(i_data), .q(data_out) );
Some important considerations to note with named port connections are:
• Port connections can be listed in any order. Instance d_reg, above, does not list the
port connections in the same order as the port definitions within module reg32
(shown previously, in section 2.4.1, page 55).
• Unused ports can be explicitly listed, but with no local signal name in the
parentheses, as shown for the connection to the qb port in instance b_reg, or
unused ports can be left out of the connection list, as shown in instance d_reg.
• The code is self-documenting. It is visually apparent which nets are connected to
which ports of the reg32 module, without having the file containing the source
code for reg32 open or available.
nections over port order connections, and adds the advantage of a netlist that is more
concise, easier to read, and easier to maintain.
Dot-name shortcut rules. The dot-name named connection shortcut requires the
following conditions be met in order to infer a connection between a named port
and a net or variable:
• A net or variable with a name that exactly matches the port name must be declared
prior to the module instance.
• The net or variable vector size must exactly match the port vector size.
• The data types on each side of the port must be compatible. Incompatible types are
defined in the IEEE SystemVerilog standard. For example, a tri1 pull-up net
connected to a tri0 pull-down net through a module port is not compatible. Such a
connection will not be inferred by the dot-name shortcut.
These restrictions reduce the risk of unintentional connections being inferred by the
dot-name shortcut.
The dot-name inferred connection shortcut also resolves a hazard that exists with
the port order and named port connections — the shortcut will not infer an implicit
net.
When port order or named port connections are used, SystemVerilog will infer a net
declaration for any undeclared net names used in a netlist. Using inferred nets can be
convenient, in that it can save having to explicitly declare internal interconnecting
nets. However, inferred nets can also result in design bugs. An incorrectly typed name
in a netlist will not be an error. Instead, a net will be inferred for the mistyped name,
which will not be connected to other module instances in the netlist. The netlist will
compile and simulate or synthesize, but not function correctly. (Inferred nets are
discussed in Chapter 3, section 3.5.3, page 80.)
The dot-name shortcut will not infer an implicit net, and therefore helps to avoid
the hazards associated with implicit nets.
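A minimal sketch of the dot-name shortcut follows. The register reg8 and the netlist dot_name_demo are hypothetical models written for this illustration. Ports whose names match a declared signal use the shortcut, and ports whose names differ use the explicit named connection form.

```systemverilog
module reg8                          // hypothetical 8-bit register
( input  logic       clk, rstN, load,
  input  logic [7:0] d,
  output logic [7:0] q
);
  always_ff @(posedge clk or negedge rstN)
    if (!rstN)     q <= '0;
    else if (load) q <= d;
endmodule: reg8

module dot_name_demo
( input  logic       clk, rstN, load_a,
  input  logic [7:0] data_in,
  output logic [7:0] q
);
  reg8 a_reg (.clk,                  // same as .clk(clk)
              .rstN,                 // same as .rstN(rstN)
              .load(load_a),         // names differ: explicit connection
              .d(data_in),           // names differ: explicit connection
              .q                     // same as .q(q)
             );
endmodule: dot_name_demo
```

If clk were mistyped as clock in the instance, the compiler would report an error rather than infer a new net, because reg8 has no port named clock and the shortcut never creates implicit nets.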
As with the dot-name shortcut, for a connection to be inferred, all nets must be
explicitly declared, the name and vector size must match exactly, and the types
connected together must be compatible. Any connections that cannot be inferred
by dot-star must be explicitly connected together, using the full named port
connection syntax, as shown in the following code snippet.
alu alu (.a(a_data),
         .b(b_data),
         .*           // infer all other connections
        );
With dot-star, the only connections that need to be listed are the ones where the port
name and the connecting net name are not the same. This can be an advantage versus
the dot-name shortcut because it makes these differences more obvious. A
disadvantage, however, is that it is not easy to see what connections have been
inferred by dot-star. This can make code maintenance and debugging more difficult.
Another advantage of dot-star over dot-name is that the dot-star will not allow a
port to be inadvertently left unconnected. Unlike explicit named port connections and
the dot-name shortcut, the dot-star shortcut will not infer an unconnected port. Ports
must be explicitly shown as not having a connection using empty parentheses,
such as .qb().
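The rules above can be sketched with a small, hypothetical register reg8q that has an inverted output the netlist does not use. The module names and signals are assumptions made for this illustration.

```systemverilog
module reg8q                         // hypothetical register with q and qb
( input  logic       clk, rstN, load,
  input  logic [7:0] d,
  output logic [7:0] q, qb
);
  always_ff @(posedge clk or negedge rstN)
    if (!rstN)     q <= '0;
    else if (load) q <= d;
  assign qb = ~q;
endmodule: reg8q

module dot_star_demo
( input  logic       clk, rstN, load,
  input  logic [7:0] data_in,
  output logic [7:0] q
);
  reg8q a_reg (.d(data_in),          // names differ: explicit connection
               .qb(),                // intentionally unconnected
               .*                    // infers clk, rstN, load and q
              );
endmodule: dot_star_demo
```

Only the two connections that differ from the port names are visible in the instance, which is both the strength and the weakness described above.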
2.5 Summary
As with all programming languages, SystemVerilog has specific syntax and
semantic rules which must be followed when writing SystemVerilog models. This
chapter has examined the essential rules for writing RTL models. Important
considerations include the proper use of white space and comments, and proper
naming conventions.
The SystemVerilog language has been around since 1984, and was originally called
Verilog. The language has evolved over this 30-year period. There have been several
versions of the Verilog and SystemVerilog standard. Each version of the standard has
reserved additional keywords that were not reserved in prior standards.
SystemVerilog provides a pair of compiler directives, to help ensure that legacy
models will compile correctly with compilers that are based on a later version of
the standard. These directives are `begin_keywords and `end_keywords.
Large designs are partitioned into sub blocks, with each block represented as a
module. A higher level module is used to instantiate and connect these sub modules.
SystemVerilog provides two ways to connect modules: port order connections (which
have several hazards) and named port connections. Named port connections are more
verbose than port order connections, but can help prevent subtle, difficult to find,
connection errors. The dot-name and dot-star named port connection shortcuts help
simplify large netlists, while retaining the advantages of named port connections.
* * *
Chapter 3
Net and Variable types
Abstract — SystemVerilog has two major groups of data types, nets and variables.
There are a number of predefined types within these two groups, which are used to
model both designs and verification testbenches. The major concepts discussed in this
chapter are:
• Two-state and four-state values
• Literal values
• Variable types
• Net types
• Arrays of nets and variables
• Assignment rules for nets and variables
• Port types
• Parameter constants
uncertainty in how physical silicon would behave under specific circumstances, such
as when simulation cannot predict whether an actual silicon value would be a 0 or 1
(or Z for a tri-state device). For synthesis, an X value also provides design engineers a
way to specify “don’t-care” conditions, where the engineer is not concerned about
whether actual silicon will have a 0 or a 1 value for a specific condition.
Simple decimal literal integers. A literal integer value can be specified as a simple
number, such as the number 9, as shown in the following code snippet:
result = d + 9;
A simple literal number is treated by simulation and synthesis as:
• A 32-bit wide value
• A signed value
• A decimal value
• A 2-state value (no bits can be Z or X)
These characteristics, along with the characteristics of d, will affect how the
addition is performed, and how the assignment to result is performed. These
effects are discussed in Chapter 5 on SystemVerilog operators and operations.
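A minimal simulation sketch of these characteristics is shown below. The module simple_literal_demo and its signal names are hypothetical, and d is assumed to be an 8-bit unsigned variable purely for illustration.

```systemverilog
module simple_literal_demo;
  logic [7:0] d;
  logic [7:0] result8;
  logic [8:0] result9;
  int         literal_size;

  initial begin
    d = 8'd250;
    literal_size = $bits(9);   // the simple literal 9 is 32 bits wide
    result8 = d + 9;           // 259 does not fit in 8 bits; 3 is stored
    result9 = d + 9;           // 9 bits hold the full sum, 259
  end
endmodule: simple_literal_demo
```

The addition itself is performed at 32 bits because of the literal. The loss of the upper bit in result8 comes from the assignment, not from the operation.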
Signed literal integers. By default, a literal value with a base specified is treated as
an unsigned value in operations and assignments. This default can be overridden by
adding the letter s or S after the apostrophe and before the base specifier.
result = 'sd9 + 'sh2F + 'sb1010;
Signed values are treated differently than unsigned values in certain operations and
in assignment statements. The effects of signed and unsigned values are discussed in
Chapter 5 on operators and operations.
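One operation where signedness changes the result is a relational comparison. The following sketch, using hypothetical names, compares the same bit patterns as unsigned and as signed literals.

```systemverilog
module signed_literal_demo;
  logic unsigned_lt, signed_lt;

  initial begin
    // 4'b1111 is 15 when unsigned, so 15 < 1 is false.
    unsigned_lt = (4'b1111  < 4'b0001);
    // 4'sb1111 is -1 when signed, so -1 < 1 is true.
    signed_lt   = (4'sb1111 < 4'sb0001);
  end
endmodule: signed_literal_demo
```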
Sized literal integers. By default, a simple literal number and a literal number with
a base specified are treated as 32-bit values in operations, programming statements,
and assignment statements. This default does not accurately represent hardware
models that use other vector sizes.
A value with a specific base can also have a specific bit-width specified. The
number of bits used to represent the value is specified before the apostrophe,
signedness and base specification.
result = 16'd9 + 8'h2F + 4'b1010;
NOTE
Synthesis compilers and lint checkers may generate warning messages when
the size of a literal value is not the same as the variable on the left-hand side
of an assignment statement. These size mismatch warning messages can hide
other messages that require attention. Using explicitly sized literal values
will prevent unintentional size mismatch warnings.
The use of octal values has been obsolete for decades. Literal decimal values can be
easily confused with other numbers. The old engineering joke applies here... “There
are 10 types of people in the world, those that understand binary, and those that
don’t.”
Simulators might report a non-fatal warning message when a truncation occurs, but
are not required to report a warning. Simulators will silently extend a literal value to
match a size, without generating any warnings. There is a risk of verifying design
functionality in simulation without realizing there was a size/value mismatch. Using a
lint program will reveal any mismatches in literal values.
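The silent truncation and extension described above can be sketched as follows. The module and variable names are hypothetical. A simulator may or may not warn about the first assignment, and is not expected to warn about the other two.

```systemverilog
module literal_resize_demo;
  logic [3:0] narrow;
  logic [7:0] wide_u, wide_s;

  initial begin
    narrow = 8'hA5;     // upper 4 bits are truncated; 4'h5 is stored
    wide_u = 4'b1010;   // unsigned literal is zero-extended to 8'h0A
    wide_s = 4'sb1010;  // signed literal is sign-extended to 8'hFA
  end
endmodule: literal_resize_demo
```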
NOTE
RTL synthesis compilers typically do not support real (floating-point)
expressions. High-level Synthesis (HLS) tools can be used for complex
arithmetic design. Floating point and fixed point design is outside the scope
of this book on RTL modeling.
SystemVerilog provides two general groups of data types, nets and variables. Nets
and variables have both a type and a data type. Type indicates that the signal is a net or
variable. Data type indicates the value system of the net or variable, which is either 2-
state or 4-state. For simplicity, this book uses the term data type to mean both the type
and data type of a signal.
Data types are used by software tools, such as simulators and synthesis compilers,
to determine how to store data and process changes on that data. Data types affect
operations, and are used in RTL modeling to indicate the silicon behavior desired. For
example, data types are used to determine if an adder should be integer based or
floating-point based, and whether signed or unsigned arithmetic should be performed.
Variables are required on the left-hand side of procedural block assignments. The
signals sum and out in the following code examples must be variables.
always_comb begin // combinational logic
  sum = a + b;
end
Type      Representation
reg       An obsolete general purpose 4-state variable of a user-defined
          vector size; equivalent to var logic
logic     Usually infers a general purpose var logic 4-state variable of a
          user-defined vector size, except on module input/inout ports,
          where wire logic is inferred
integer   A 32-bit 4-state variable; equivalent to var logic [31:0]
bit       A general purpose 2-state var variable with a user-defined vector
          size; defaults to a 1-bit size if no size is specified
int       A 32-bit 2-state variable; equivalent to var bit [31:0];
          synthesis compilers treat int as the 4-state integer type
byte      An 8-bit 2-state variable; equivalent to var bit [7:0]
shortint  A 16-bit 2-state variable; equivalent to var bit [15:0]
longint   A 64-bit 2-state variable; equivalent to var bit [63:0]
The use of 4-state variables allows simulators to use an X value when there is an
ambiguity as to what a value would be in actual hardware.
The context dependent logic data type. In almost all contexts, the logic data type
infers a 4-state variable the same as reg. The keyword logic is not actually a
variable type, it is a data type that indicates a net or variable can have 4-state
values. However, a variable is inferred when the logic keyword is used by itself,
or in conjunction with the declaration of a module output port. There is an
exception where logic does not infer a variable. A net type will be inferred when
logic is used in conjunction with the declaration of a module input or inout port.
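The inference rules in this paragraph can be summarized in a short sketch. The module logic_context_demo and its signals are hypothetical; the comments show the explicit declaration that each shorthand form is understood to infer.

```systemverilog
module logic_context_demo
( input  logic a,       // inferred as: input  wire logic a  (a net)
  output logic y        // inferred as: output var  logic y  (a variable)
);
  logic t;              // inferred as: var logic t          (a variable)

  always_comb begin
    t = ~a;             // t and y are variables, so they can be
    y = t;              // assigned in a procedural block
  end
endmodule: logic_context_demo
```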
The obsolete reg data type. The reg data type is an obsolete data type left over
from the original Verilog language. The logic type should be used instead of reg.
The original Verilog language used the reg data type as a general purpose variable.
Unfortunately, the use of keyword reg is a misnomer that might seem to be short for
“register”, a hardware device built with flip-flops. In actuality, there is no correlation
between using a reg variable and the hardware that is inferred. It is the context in
which a variable is used that determines if the hardware represented is combinational
logic or sequential flip-flop logic. Using logic instead of reg can help prevent this
misconception that a hardware register will be inferred.
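The point that context, not the reg keyword, determines the hardware can be sketched with a hypothetical legacy-style module in which both outputs are declared as reg but only one of them represents flip-flops.

```systemverilog
module reg_is_not_a_register
( input  wire       clk,
  input  wire [7:0] a, b,
  output reg  [7:0] sum_comb,   // reg, but represents combinational logic
  output reg  [7:0] sum_ff      // reg, and represents flip-flops
);
  always @*
    sum_comb = a + b;           // no clock: combinational adder

  always @(posedge clk)
    sum_ff <= a + b;            // clocked: adder followed by a register
endmodule
```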
An X value can indicate a design problem. When an X value occurs during
simulation, it is often an indication that there is a design problem. Some types
of design bugs that will result in an X value include:
• Registers that were not reset or otherwise initialized.
• Circuitry that did not correctly retain state during low power mode.
• Unconnected module input ports (unconnected input ports float at high-impedance,
which often results in an X value as the high-impedance value propagates to other
logic).
• Multi-driver conflicts (bus contention).
• Operations with an unknown result.
• Out-of-range bit-selects and array indices.
• Setup or hold timing violations.
Avoid 2-state data types in RTL models. The bit, byte, shortint, int and
longint data types only store 2-state values. These types cannot represent a high-
impedance (Z value), and cannot use an X value to represent uninitialized or
unknown simulation conditions. An X value that indicates potential design bugs, such
as those in the list above, will not occur when 2-state data types are used. Since
2-state data types can only have a 0 or 1 value, a design with errors can appear to be
functioning correctly during simulation. This is not good! An appropriate place to use
2-state variables is for randomized stimulus in verification testbenches.
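A minimal simulation sketch of the difference is shown below, using hypothetical variable names. Neither variable is ever assigned, which stands in for a register that was never reset.

```systemverilog
module two_state_demo;
  logic [3:0] q4;   // 4-state variable: begins simulation as 4'bxxxx
  bit   [3:0] q2;   // 2-state variable: begins simulation as 4'b0000

  initial begin
    #1;
    // The X in q4 exposes the missing initialization.
    // The 0 in q2 looks like a legitimate value and hides it.
    $display("q4=%b q2=%b", q4, q2);
  end
endmodule: two_state_demo
```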
Non synthesizable variable types. SystemVerilog has several variable types that are
intended primarily for verification, and are not generally supported by RTL synthesis
compilers. Table 3-2 lists these additional variable types. These data types are not
used in any examples in this book that are intended to be synthesized.
Type               Representation
real               A double precision floating-point variable
shortreal          A single precision floating-point variable
time               A 64-bit unsigned 4-state variable with timeunit and
                   timeprecision attributes
realtime           A double precision floating-point variable, identical to real
string             A dynamically sized array of byte types that can store a
                   string of 8-bit ASCII characters
event              A pointer variable that stores a handle to a simulation
                   synchronization object
class handle       A pointer variable that stores a handle to a class object
                   (the declaration type is the name of a class, not the
                   keyword class)
chandle            A pointer variable that stores pointers passed into
                   simulation from the SystemVerilog Direct Programming
                   Interface
virtual interface  A pointer variable that stores a handle to an interface port
                   (the interface keyword is optional)
NOTE
The var keyword is seldom used in actual SystemVerilog code. Instead, the
var type is inferred from other keywords and context.
Scalar variables. A scalar variable is a 1-bit variable. The reg, logic and bit
data types default to 1-bit scalars.
logic v5;          // a 1-bit scalar variable
logic v6, v7, v8;  // a list of three scalar variables
Signed and unsigned variables. The value stored in a vector variable can be treated
as either signed or unsigned in operations. An unsigned variable only stores
non-negative values (zero and positive). A signed variable can store positive and
negative values. SystemVerilog uses two’s-complement to represent negative values.
The most significant bit of a signed variable is the sign bit. When the sign bit is
set, the remaining bits of the vector represent a negative value in
two’s-complement form.
By default, the reg, logic, bit and time data types are unsigned variables, and
the byte, shortint, int, integer, and longint data types are signed variables.
This default can be changed by explicitly declaring a variable as signed or
unsigned.
logic signed [23:0] v11; // 24-bit signed 4-state variable
int unsigned        v12; // 32-bit unsigned 2-state variable
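The effect of the signed keyword can be sketched by storing the same bit pattern in an unsigned and a signed variable. The names below are hypothetical.

```systemverilog
module signed_variable_demo;
  logic        [7:0] u;        // unsigned by default
  logic signed [7:0] s;        // explicitly signed
  logic              u_is_negative, s_is_negative;

  initial begin
    u = 8'hFF;                 // interpreted as 255
    s = 8'hFF;                 // interpreted as -1 (two's-complement)
    u_is_negative = (u < 0);   // false: an unsigned value is never below 0
    s_is_negative = (s < 0);   // true: the sign bit is set
  end
endmodule: signed_variable_demo
```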
Constant bit selects and part selects. A vector can be referenced in its entirety, or
in part. A bit select references a single bit of a vector. A bit select is performed using
the name of the vector, followed by a bit number in square brackets ( [ ] ). A part
select references multiple contiguous bits of a vector. A part select is performed using
the name of the vector, followed by a range of bit numbers in square brackets ( [ ] ).
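A minimal sketch of constant bit and part selects is shown below. The module select_demo and the signal names data, msb and low_byte are assumptions made for this illustration.

```systemverilog
module select_demo
( input  logic [31:0] data,
  output logic        msb,
  output logic [ 7:0] low_byte
);
  assign msb      = data[31];    // bit select: one bit of the vector
  assign low_byte = data[7:0];   // part select: eight contiguous bits
endmodule: select_demo
```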
Variable bit and part selects. The bit select for msb in the preceding code snippet
used a hard-coded bit number. This is referred to as a fixed bit select. The index
number of a bit select can also be a variable. For example:
logic [31:0] data;     // 32-bit vector variable
logic        bit_out;  // 1-bit scalar variable
always @(posedge shift_clk)
  if (shift_enable) begin
    for (int i=0; i<=31; i++) begin
      @(posedge shift_clk) bit_out <= data[i];
    end
  end
The starting point of a part select can also be variable. The part select can either
increment or decrement from the variable starting point. The total number of bits
selected is a fixed range. The form of a variable part select is:
[ starting_point_variable +: part_select_size ]
[ starting_point_variable -: part_select_size ]
The +: token indicates to increment from the starting point bit number. The -:
token indicates to decrement from the starting point bit number.
The following example uses a variable part select to iterate through the bytes of a
32-bit vector.
logic [31:0] data;      // 32-bit vector variable
logic [ 7:0] byte_out;  // 8-bit vector variable
always @(posedge shift_clk)
  if (shift_enable) begin
    for (int i=0; i<=31; i=i+8) begin
      @(posedge shift_clk) byte_out <= data[i+:8];
    end
  end
Variable bit and part selects are synthesizable. However, the preceding code
snippets illustrating variable bit and part selects do not meet other RTL coding restrictions
required by some synthesis compilers. Chapter 8 discusses synthesis requirements for
sequential logic in more detail.
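One way to keep the variable part select while using a single clock edge per procedure is to hold the byte position in a small counter. The following is only a sketch under assumptions: the module name, the rstN reset and the byte_num counter are hypothetical, and whether a particular synthesis compiler accepts this style should be checked against that tool.

```systemverilog
module byte_serializer
(input  logic        shift_clk, rstN, shift_enable,
 input  logic [31:0] data,
 output logic [ 7:0] byte_out);

  logic [1:0] byte_num;  // which of the 4 bytes is output next

  always_ff @(posedge shift_clk or negedge rstN)
    if (!rstN) begin
      byte_num <= '0;
      byte_out <= '0;
    end
    else if (shift_enable) begin
      byte_out <= data[byte_num*8 +: 8];  // variable start, fixed 8-bit width
      byte_num <= byte_num + 1'b1;        // 2-bit counter wraps from 3 to 0
    end
endmodule: byte_serializer
```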
Vectors with subfields. Vectors can be declared with subfields by using two or more
sets of square brackets to define the vector range. The following code snippet shows
the difference between a simple 32-bit vector and a 32-bit vector with subfields:
logic [31:0] a; // 32-bit simple vector
logic [3:0] [7:0] b; // 32-bit vector, subdivided into
// 4 8-bit subfields
Figure 3-1 illustrates the difference in these two declarations.
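As a short illustration of how subfields are referenced, the sketch below selects a whole subfield and a single bit within a subfield. The module name and the signals top_byte, one_bit and whole are assumptions made for this example.

```systemverilog
module subfield_demo;
  logic [3:0][7:0] b;         // 32-bit vector with 4 8-bit subfields
  logic [7:0]      top_byte;
  logic            one_bit;
  logic [31:0]     whole;

  assign top_byte = b[3];     // subfield 3, the most-significant 8 bits
  assign one_bit  = b[0][7];  // bit 7 of subfield 0
  assign whole    = b;        // the vector can still be used in its entirety
endmodule: subfield_demo
```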
NOTE
Uninitialized 2-state variables can hide design problems. An uninitialized
2-state variable has the value of 0, which can appear to be a legitimate reset
value. This can potentially hide problems with reset logic in a design.
NOTE
In-line variable initialization is not supported in ASIC technologies, and
might be supported by some FPGA technologies.
When targeting a device that does not support programmable power-up
states, synthesis compilers will either: (a) not allow in-line initialization, or
(b) ignore it. A mismatch in RTL simulation behavior and the synthesized
gate-level implementation can occur when in-line initialization is ignored.
For ASIC design, reset functionality should be used to initialize variables. Do not
use in-line initialization. For FPGA design, only use in-line initialization if it is
certain that the RTL model will always be targeted to a device that supports power-up
register states. The use of in-line initialization in RTL models effectively locks the
model to only be used with that type of FPGA device.
Synthesis compilers and target FPGA devices that support in-line variable
initialization also allow using initial procedures to model the power-up value of
flip-flops.
Sequential logic reset and the appropriate use of variable initialization in RTL
models are discussed in Chapter 8, section 8.1.5 (page 286).
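The guideline above can be sketched with a small counter that gets its starting value from a reset rather than from in-line initialization. The module and signal names are hypothetical, and the active-low asynchronous reset is just one possible reset style.

```systemverilog
module counter
(input  logic       clk, rstN,
 output logic [3:0] count);

  // An in-line initial value on count would only be honored by devices
  // that support power-up register states, so a reset is used instead.
  always_ff @(posedge clk or negedge rstN)
    if (!rstN) count <= '0;            // reset establishes the starting value
    else       count <= count + 1'b1;
endmodule: counter
```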
Nets are used to connect design elements together, such as connecting the output
port of one module to the input port of another module. Nets differ from variables in
three significant ways:
• Nets do not have temporary storage like variables. Instead, nets reflect the current
value of the driver(s) of the net. (A capacitive trireg net appears to store a value,
but is actually representing the behavior of a capacitor driving the net).
• Nets can resolve the resultant value of multiple drivers, where variables can only
have a single source (if multiple procedural assignments are made to a variable, the
last assignment is the resultant value, rather than resolving the result of all
assignments).
• Nets reflect both a driver value (0, 1, Z or X) and a driver strength.
The strength level of a driver is represented in steps from 0 to 7. Each level is
represented by a keyword. The default strength level for most modeling constructs is
strong, which is a level 6. Strength levels are important for transistor-level modeling,
but are not used in RTL modeling. The representation and usage of strengths is
outside the scope of this book on RTL modeling.
Non-synthesizable net types. SystemVerilog has several net types that are not
universally supported by synthesis compilers, which are listed in Table 3-4 (page 77).
NOTE
Some RTL synthesis compilers might support one or more of these net types.
A best-practice coding style is to not use these types in order to ensure the
RTL model is compatible with any synthesis compiler. If one of these types
is used, design engineers should check that all tools used in the project
support that type.
Modeling CMOS technology. Most ASIC and FPGA devices are implemented
using CMOS technology. The behavior of CMOS interconnections is represented
using the wire and tri net types. The wire type is the most commonly used net
type, and is the default net type when nets are implicitly inferred (see section 3.5.3,
page 80).
Declaring input ports as variable types instead of net types. By default, input and
inout ports infer a net type, specifically the wire type, unless the
`default_nettype directive specifies a different net type (see section 3.5.3, page 80). This
inference of a net type can result in hard-to-detect modeling errors where multiple
drivers are connected to the same input port (or a value is back driven onto the input
port from within a module). These modeling errors are legal in SystemVerilog
because net types permit multiple drivers.
Unintentional multiple drivers of an input port can be prevented by explicitly
declaring the input port as a var logic type. Variables do not permit multiple
drivers. Inadvertent multiple drivers will be reported as a coding error when the design
modules are compiled and elaborated.
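A brief sketch of an input port declared as a variable is shown below. The module and signal names are assumptions for illustration. The commented-out assign shows the kind of back-driving mistake that a variable port turns into a compile or elaboration error.

```systemverilog
module stage
(input  var logic       clk,
 input  var logic [7:0] d,    // variable port: only one source is permitted
 output var logic [7:0] q);

  // assign d = 8'h00;  // a second source for d; illegal on a variable port

  always_ff @(posedge clk)
    q <= d;
endmodule: stage
```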
Using uwire to prevent multiple drivers. The uwire net type can also be used to
prevent inadvertent multiple drivers of an input port. The uwire type was added to
SystemVerilog as part of the 1364-2005 Verilog standard, specifically to make
unintentional multiple drivers be a compile/elaboration error. An input port can be
explicitly declared as a uwire type, or the default net type can be changed to uwire, as
discussed in section 3.5.3 (page 80). The uwire type does not permit multiple
drivers. Inadvertent multiple drivers will be reported as a coding error when the design
modules are compiled and elaborated.
NOTE
At the time this book was written, most synthesis compilers, and some
simulators, had not yet added support for the uwire type, even though it has
been part of the Verilog/SystemVerilog standard since 2005. The examples
in this book use the wire or tri type when a multi-driver net is required.
By default, all net types are unsigned. Nets can be explicitly declared as signed or
unsigned in the same way as variables. See section 3.4.2 (page 70).
Net bit and part selects. Any specific bit or group of bits can be selected from a net
vector using the same syntax as with variable vectors, as described in section 3.4.2
(page 70). Both constant and variable bit and part selects can be performed on nets.
always_comb begin
co = n2 | n3; // OK because n2 and n3 were
end // previously inferred as net types
The dot-name and dot-star inferred port connections (see sections 2.4.3, page 57,
and 2.4.4, page 58, in Chapter 2) do not infer implicit internal nets. Implicit nets that
are inferred from port declarations can be used with the dot-name and dot-star
inferred port connections, but all internal nets must be explicitly declared in order to
use these port connection shortcuts.
Changing the default implicit net type. The implicit net type can be changed using
the compiler directive `default_nettype. The directive is followed by any
SystemVerilog net type. All SystemVerilog code that is compiled after the directive will
use the specified net type whenever an implicit net is inferred. The
`default_nettype directive must be specified outside of a module or interface boundary.
Example 3-2 defines the implicit net type to be the uwire (single driver) type,
always_comb begin
  co = n2 | n3;  // OK because n2 and n3 were
end              // previously inferred as net types
endmodule: mixed_rtl_and_gate_adder
`default_nettype wire // reset default for implicit nets
Turning off implicit net declarations. There are advantages and disadvantages of
implicit nets. Large, complex netlists will likely require several dozen 1-bit nets in
order to connect the design blocks. Having to explicitly declare these many nets is
tedious and time consuming. Explicitly declaring large numbers of interconnecting
nets can also require a lot of typing, with the inherent risk of typographical errors that
require debugging. Implicit nets can reduce the time required to write netlist models
and reduce typographical errors.
Using implicit nets, or disabling implicit nets, is often a personal preference, and
sometimes a coding guideline within a company. The examples in this book assume
that implicit nets are enabled, and that the default implicit net type is wire.
NOTE
The `default_nettype directive can affect multiple files. Compiler
directives are quasi-global in a compilation unit. When multiple files are
compiled in the same compilation unit, a compiler directive has no effect on any
files compiled before the directive is encountered, but does affect all files
compiled after the directive is encountered.
Setting the default net type back to wire after any module that changes the default
will prevent unintentional side effects of affecting other files that expect a default of
wire.
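The practice of changing the default and then restoring it can be sketched as follows. This example assumes the standard none setting of `default_nettype, which disables implicit nets entirely. The module and net names are illustrative only.

```systemverilog
`default_nettype none   // no implicit nets in the code that follows

module and_or
(input  wire logic a, b, c,   // net type must now be stated explicitly
 output wire logic y);

  wire logic n1;        // with implicit nets disabled, an undeclared n1
                        // (or a misspelling of it) is a compile error
  and g1 (n1, a, b);
  or  g2 (y, n1, c);
endmodule: and_or

`default_nettype wire   // restore the default for files compiled later
```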
Assigning values to nets. Nets can receive a value from two types of sources: as a
connection to an output or inout port, and as the left-hand side of a continuous
assignment (an assign statement). Nets cannot be used on the left-hand side of procedural
assignments.
Continuous assignments are evaluated throughout simulation. Any changes on the
right-hand side of the assignment cause the right-hand side expression to be
re-evaluated and the left-hand side updated. The left-hand side can be a variable or a net.
Continuous assignments to a net can be explicit or implicit. An explicit continuous
assignment begins with the keyword assign.
wire [15:0] sum;
assign sum = a + b; // explicit continuous assignment
An implicit continuous assignment combines the declaration of a net and the
assignment to that net. The assign keyword is not used in the combination.
wire [15:0] sum = a + b; // net with implicit continuous
// assignment
Be careful not to confuse in-line variable initialization and implicit continuous
assignments.
logic [15:0] v1 = a + b; // in-line variable initialization
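The difference between the two forms shows up in simulation. The sketch below is an assumed test case, not code from the book; the names sum_net and sum_var are hypothetical.

```systemverilog
module assign_vs_init;
  logic [15:0] a, b;
  wire  [15:0] sum_net = a + b;  // implicit continuous assignment:
                                 // re-evaluated whenever a or b changes
  logic [15:0] sum_var = a + b;  // in-line initialization: evaluated once,
                                 // before any procedure starts running

  initial begin
    a = 16'd1; b = 16'd2;
    #1 a = 16'd10;
    // sum_net is 12 here; sum_var still holds its time-zero value
    #1 $display("sum_net=%0d sum_var=%0d", sum_net, sum_var);
  end
endmodule: assign_vs_init
```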
Connection size mismatches. A net is used to connect design blocks together, such
as connecting an output port of one module to an input port of one or more other
modules. Typically, the vector widths of the ports and the interconnecting net are the
same, but SystemVerilog allows the vector sizes to be different. For example, a 16-bit
vector net can connect a 32-bit wide output port to an 8-bit wide input port. This
mismatch in sizes is probably a design error, but, in SystemVerilog, only a warning is
generated.
The SystemVerilog language has rules for resolving port/connection mismatches:
• A port has fewer bits than the net or variable connected to it: the left-most bits of
the value are truncated, resulting in the most-significant bits of the value being lost.
• A port has more bits than the net or variable connected to it: the value of the net
or variable is left-extended. If either the port or the net/variable is unsigned, the value is
zero-extended. If both the port and the net/variable are signed, the value is
sign-extended.
Simulators and synthesis compilers will generate warning messages for connection
size mismatches. These warnings should not be ignored! Connection mismatches are
usually a design error that needs to be corrected.
The dot-name and dot-star inferred port connections (see sections 2.4.3, page 57,
and 2.4.4, page 58, in Chapter 2) do not allow connection size mismatches.
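The truncation rule can be illustrated with a deliberately mismatched connection. The modules sink8 and top and the value 16'hABCD are assumptions chosen for this sketch.

```systemverilog
module sink8
(input  logic [7:0] d,
 output logic [7:0] q);
  assign q = d;
endmodule: sink8

module top;
  logic [15:0] wide = 16'hABCD;
  logic [ 7:0] q;

  // A 16-bit variable connected to an 8-bit input port is legal, but tools
  // are expected to warn. The upper 8 bits (8'hAB) are truncated, so the
  // port receives 8'hCD.
  sink8 u1 (.d(wide), .q(q));
endmodule: top
```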
Module definitions include a port list, which is enclosed in parentheses. Ports are
used to pass data into or out of a module. Modules can have four types of ports:
input, output, bidirectional inout, and interface. Input, output and inout ports are
discrete ports, where each port communicates a single value or user-defined type.
Interface ports are compound ports that can communicate a collection of several
values. This section describes the syntax and usage guidelines of discrete ports. Interface
ports are described in Chapter 10.
• The port size can range from 1-bit wide to 2^16 (65,536) bits wide. In practice,
engineers must consider the size limitations of the ASIC or FPGA technology that will
be used to implement the design.
Ports are declared in a module port list, which is enclosed in simple parentheses.
Ports can be listed in any order. Some engineers prefer listing inputs first, followed by
outputs. Other engineers prefer listing outputs first, followed by inputs. Some
companies have strict coding style rules regarding the order of ports, and other companies
leave the order up to the engineer(s) writing the module definition. Engineers also
differ widely on coding style regarding the use of indentation, and whether to list
multiple ports on the same line or separate lines.
SystemVerilog provides three coding styles for declaring port lists and port
declarations: combined-style, legacy-style and legacy-style with combined type and size.
Combined-style port lists. The combined-style port list puts the full declaration of
each port within the port list parentheses. This style is preferred by most engineers.
module alu
(input wire logic signed [31:0] a, // 32-bit input
input wire logic signed [31:0] b, // 32-bit input
input wire logic [ 3:0] opcode, // 4-bit input
output var logic signed [31:0] result, // 32-bit output
output var logic overflow, // 1-bit output
output var logic error // 1-bit output
);
Observe that each port declaration is separated by a comma, and that the last port in
the list does not have a comma before the closing parenthesis.
Multiple ports of the same direction, type, data type and size can be declared using
a comma-separated list of port names. By combining the declarations of similar ports,
the preceding port list can be simplified as follows:
module alu
(input wire logic signed [31:0] a, b, // 32-bit inputs
input wire logic [ 3:0] opcode, // 4-bit input
output var logic signed [31:0] result, // 32-bit output
output var logic overflow, error // 1-bit
);
The IEEE SystemVerilog standard refers to the combined-style of port declarations
as an ANSI-style port list, because the style is similar to the ANSI C style for function
declarations. This style of port declaration was added to Verilog as part of the
Verilog-2001 standard.
Legacy-style port lists. The original Verilog-1995 standard separated the port list
and the declarations of the type, data type, sign and size of each port. The
SystemVerilog standard refers to this separated style as non-ANSI style port lists. This style is
similar to the original, pre-ANSI C style for function declarations. The following
example uses Verilog-2001 data types. The SystemVerilog logic type can also be
used with the legacy Verilog-style port list.
module alu (a, b, opcode, result, overflow, error);
input [31:0] a, b; // 32-bit inputs
input [ 3:0] opcode; // 4-bit input
output [31:0] result; // 32-bit output
output overflow, error; // 1-bit outputs
wire signed [31:0] a, b; // 32-bit nets
wire [ 3:0] opcode; // 4-bit net
reg signed [31:0] result; // 32-bit variable
reg overflow, error; // 1-bit variables
Observe that each port declaration is terminated by a semicolon, but a
comma-separated list of port names can be used for ports that have the same direction and size, or
the same type, data type, and size (such as ports a and b, or overflow and error in
the preceding port declarations).
If the port direction, type, data type, sign and size are all omitted on the first port in
the port list, then a legacy non-ANSI style port list is assumed for the entire port list.
All ports in a port list must be either the combined ANSI style or the legacy non-
ANSI style. It is illegal to mix the two styles in the same port list.
Legacy-style port lists with combined direction and size. The Verilog-2001
standard allows the legacy-style port list to combine the direction declaration and the
type/data type declaration into a single statement.
module alu_4 (a, b, opcode, result, overflow, error);
input wire signed [31:0] a, b; // 32-bit inputs
input wire [ 3:0] opcode; // 4-bit input
output reg signed [31:0] result; // 32-bit output
output reg overflow, error; // 1-bit output
Module port defaults. There are implicit defaults for each port’s direction, type,
data type, signedness, and size. The port type can be a net, such as wire, or a variable,
such as var. The port data type can be logic (4-state) or bit (2-state). The default
rules for port direction, type, data type, signedness and size are:
• No direction specified: The default direction for module ports is inout, but only
until a direction has been defined. Once a direction has been specified, that
direction applies to all subsequent ports until a new direction is specified.
• No type specified: The default type for ports is wire when no data type, such as
logic, is specified. When a data type is specified, the default type is wire for
input and inout ports and var for output ports. The default wire type can be changed using the
`default_nettype compiler directive, as described in section 3.5.3 (page 80).
• No data type specified: The default data type for all ports is logic (4-state).
• No signedness specified: The default signedness is the default signedness of the
port’s data type. The reg, logic, bit and time data types default to unsigned.
The byte, shortint, int, integer, and longint data types default to signed.
• No size specified: The default size is the default size of the port’s data type. The
reg, logic and bit data types default to 1-bit wide. The default size for other data
types is discussed in section 3.4.2 (page 70).
The following code snippet is not a realistic RTL coding style, but serves to
illustrate the implicit defaults of module port declarations.
module alu // IMPLICIT DEFAULTS:
(wire logic signed [31:0] a, b, // inout
wire logic [3:0] opcode, // inout, unsigned
output signed [31:0] result, // wire, logic
output var overflow, // logic, unsigned, 1-bit
output bit error // var, unsigned, 1-bit
);
Although the port declarations in the preceding code snippet are synthesizable, it is
not a recommended coding style for synthesizable RTL models. Section 3.6.3 (page
88) provides some coding guidelines for port declarations.
SystemVerilog has two types of arrays: packed arrays and unpacked arrays. Packed
arrays are a collection of bits that are stored contiguously, and are commonly referred
to as vectors. Packed arrays are discussed in section 3.4.2 (page 70). Unpacked arrays
are a collection of nets or variables.
Each net or variable in the collection is referred to as an array element. Each
element of an unpacked array is exactly the same type, data type and vector size. Each
unpacked array element can be stored independently from other elements; the
elements do not need to be stored contiguously. Software tools, such as simulators and
synthesis compilers, can organize the storage of unpacked arrays in whatever form the
tool deems optimal.
The basic declaration syntax of an unpacked array is:
type_or_data_type vector_size array_name array_dimensions
The array_dimensions defines the total number of elements the array can store.
Unpacked arrays can be declared with any number of dimensions, with each
dimension storing a specified number of elements. There are two coding styles for declaring
the array dimensions: explicit addresses, and array size.
The explicit addresses style specifies the starting address and ending address of the
array dimension between square brackets, in the form:
[ start_address : end_address ]
The start_address and end_address can be any integer value. The array could start
with address 0, address 512, or whatever address is required for the hardware being
modeled. The range between the start and ending address represents the size (number
of elements) for the array dimension.
The array_size style defines the number of elements to be stored in square brackets
(similar to the C language array declaration style).
[ size ]
With the array_size style, the starting address is always 0, and the ending address is
always size - 1.
Some example unpacked array declarations are:
// a 1-dimensional unpacked array of 1024 1-bit nets
wire n [0:1023];
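A few more declarations in both styles are sketched below. The array names and sizes are assumptions made for illustration.

```systemverilog
module array_decl_demo;
  logic [ 7:0] lut  [256];         // array-size style: 256 bytes, addresses 0 to 255
  logic [31:0] regs [512:575];     // explicit addresses: 64 words, addresses 512 to 575
  logic [ 7:0] grid [0:1][0:3];    // 2 dimensions: a 2-by-4 array of bytes
  logic [ 7:0] same [2][4];        // same layout as grid, using the array-size style
endmodule: array_decl_demo
```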
An array index can also be the value of a net or a variable, as in the next example.
always_ff @(posedge clk)
  data <= mem[address]; // value of address is array index
Chapter 8, section 8.3 (page 317) discusses using arrays to model RAM
functionality, and shows examples of complete synchronous and asynchronous RAM models.
The list syntax is similar to assigning a list of values to an array in C, but with the
added apostrophe before the opening brace. Using '{ as the opening delimiter shows
that enclosed values are a list of expressions, not the SystemVerilog concatenation
operator, as discussed in Chapter 5, section 5.11 (page 181).
A multidimensional array can also be assigned a list of values by using nested lists.
The nested sets of lists must exactly match the dimensions of the array.
logic [7:0] data [0:1] [0:3]; // 2-by-4 array layout
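A nested list for this 2-by-4 layout might look like the following sketch. The module name and the values assigned are arbitrary, chosen only to show which element each value lands in.

```systemverilog
module nested_list_demo;
  logic [7:0] data [0:1][0:3];   // 2-by-4 array layout

  initial begin
    // The outer list has 2 entries (first dimension); each inner list
    // has 4 entries (second dimension).
    data = '{ '{8'h00, 8'h01, 8'h02, 8'h03},     // data[0][0] to data[0][3]
              '{8'h10, 8'h11, 8'h12, 8'h13} };   // data[1][0] to data[1][3]
  end
endmodule: nested_list_demo
```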
NOTE
The '{...} list is not the same as the {...} concatenate operator, which is
described in Chapter 5, section 5.11 (page 181). The list operator treats each
value in the list as a separate value that corresponds to a separate element in
an array. The concatenate operator packs the values in the list into a vector.
All elements of an unpacked array can be assigned the same value by specifying a
default value. The default value is specified using '{default:<value>}, as shown
in the following code snippet:
logic [7:0] lut [0:3]; // array with 4 elements
logic [7:0] a, b, c, d; // 4 8-bit variables
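How the default form might be applied to the lut array declared above is sketched here. The module wrapper and the value 8'hFF are assumptions for illustration.

```systemverilog
module default_list_demo;
  logic [7:0] lut [0:3];         // array with 4 elements
  logic [7:0] a, b, c, d;        // 4 8-bit variables

  initial begin
    lut = '{default: 8'hFF};     // every element of lut is set to 8'hFF
    lut = '{a, b, c, d};         // one value per element: lut[0]=a ... lut[3]=d
  end
endmodule: default_list_demo
```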
Passing arrays through ports and to tasks and functions. Unpacked arrays of any
type and any number of dimensions can be passed through module ports, or to task
and function arguments. The port or task/function formal argument must also be
declared as an array. The port or argument array must have an identical layout as the
array to be passed (the same rules as an array copy).
module cpu (...);
  ...
endmodule: cpu
module gpu
(input logic [7:0] lut [0:255] // array port
 ...                           // other ports
);
endmodule: gpu
The original Verilog language only allowed simple vectors to be passed through a
module port, or to a task or function argument. To pass the values of the table array in
the example above would have required 256 ports, one for each element of the array.
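A complete, compilable variant of this idea is sketched below. The ports index and value, and the array name lut_mem, are hypothetical additions that make the example self-contained. They are not taken from the book's cpu and gpu example.

```systemverilog
module gpu
(input  logic [7:0] lut [0:255],   // array port: 256 8-bit elements
 input  logic [7:0] index,
 output logic [7:0] value);
  assign value = lut[index];       // look up one element of the array
endmodule: gpu

module cpu;
  logic [7:0] lut_mem [0:255];     // same layout as the gpu array port
  logic [7:0] index, value;

  gpu g1 (.lut(lut_mem), .index(index), .value(value));
endmodule: cpu
```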
Parameters are run-time constants, meaning the value of the parameter can be
configured during compilation/elaboration time, and becomes fixed once simulation
starts running, or when synthesis begins the process of translating RTL functionality
into an ASIC or FPGA implementation. Another module can instantiate the
add_n_bits module and reconfigure parameter N for that instance, as shown in the
following code snippet.
module alu (/* ports not shown */);
  logic [31:0] a, b, sum;
  add_n_bits #(.N(32)) add32 (.*); // configure as 32 bits
endmodule: alu
Sections 3.8.1 and 3.8.2 (page 97) provide details on declaring and overriding
module parameters.
Parameters can be declared in two places within a module: using a #(...) parameter
list before the module port list (as shown previously in example 3-4, page 94), or as
local declarations after the module port list.
Parameters declared within a module use a syntax similar to declaring a local
variable or net. The general syntax is:
parameter data_type signedness size name = value_expression ;
the signedness are specified, the parameter takes on the signedness of the final value
assigned to the parameter.
size (optional) is specified in the same way as for variable and net vectors. If not
specified, the default size of the data type is used. If neither the data type nor the size
are specified, the parameter takes on the size of the final value assigned to it.
name can be any legal or escaped identifier name. A common convention is to use
all capital letters for constants, though there is no syntactic requirement to do this.
value_expression can be any expression that resolves to a valid value for the
parameter data type. The value expression must be a constant expression, which means it
must be able to be evaluated by a compiler, without running simulation. A constant
expression can use literal values, other constants, and calls to constant functions
(functions that do not have output or inout arguments, or external references).
Multiple parameters of the same explicit or implicit data type, signedness and size
can be declared as a comma separated list of names.
Some example local parameter declarations are:
module parameter_examples;
parameter SIZE = 256; // defaults to a logic signed [31:0]
parameter PI = 3.14; // defaults to a real data type
parameter string REV = "version 1.1a"; // explicit type
localparam bit [15:0] N = $clog2(SIZE); // explicit type
localparam [2:0] READY = 3'b001, // 3 constants, logic type
                 LOAD  = 3'b010,
                 STORE = 3'b100;
...
endmodule: parameter_examples
Parameters SIZE and PI above will take on the data type of the final value assigned
to them. The data type could change during compilation and elaboration if the
parameter values are overridden by an external assignment.
Parameters REV and N have an explicit data type. The value assigned to the
parameter must be compatible with the parameter’s data type, and the value will be converted
to that data type. This restriction also applies to any external assignments to the
parameter that override the declared value.
Parameters READY, LOAD and STORE have an explicit size, and an implicit data type
of logic. The value assigned to the parameter must be compatible with the
parameter’s data type, and the value will be converted to that data type.
Parameters N, READY, LOAD and STORE are localparam parameters. As such, the
value of the parameter cannot be overridden by an external redefinition assignment.
However, the value of a localparam can be an expression that is calculated from other
parameters that can be overridden, as shown with parameter N.
The value for parameter N is the value returned from a call to the $clog2 system
function. This is a constant function that returns the ceiling of the log base 2 of its
argument (the log rounded up to an integer value).
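A common use of $clog2 is sizing an address from a configurable depth. The sketch below assumes hypothetical names DEPTH, ADDR_W and address.

```systemverilog
module clog2_demo
#(parameter DEPTH = 256);

  // Number of bits needed to address DEPTH locations; 8 when DEPTH is 256.
  // Overriding DEPTH automatically changes ADDR_W as well.
  localparam ADDR_W = $clog2(DEPTH);

  logic [ADDR_W-1:0] address;

  initial $display("DEPTH=%0d needs %0d address bits", DEPTH, ADDR_W);
endmodule: clog2_demo
```

Because the result is rounded up, a depth that is not a power of two still gets enough bits; for example, $clog2(17) is 5.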
Module parameter lists. A parameter list is specified before the module port list,
and allows parameters to also be used to make the module ports configurable.
A parameter list is enclosed between the tokens #( and ). The list can contain
declarations for any number of parameters. The syntax is the same as for local parameter
declarations, with the exception that each declaration is separated by a comma instead
of a semicolon. Example 3-5 illustrates the use of a parameter list to model a
configurable bus-functional RAM.
Observe in this code that the parameters in the parameter list are separated by
commas, and that there is no comma or semicolon after the closing parenthesis of the
parameter list.
The parameter keyword is optional in a module parameter list. The #( token indicates
that a parameter list is beginning, so software tools do not need the parameter
keyword to know that parameters are being defined.
It is also optional to assign a value to parameters in a parameter list. If a parameter
does not have a value in its declaration, it is mandatory that a value be assigned
externally from a parameter override, as discussed in section 3.8.2 (page 97).
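The parameter list rules above can be sketched with a small configurable memory. This is not the book's Example 3-5. The module name reg_file and the parameters DATA_W and DEPTH are assumptions for illustration.

```systemverilog
module reg_file
#(parameter DATA_W = 8,    // declarations are separated by commas
  parameter DEPTH  = 16    // no comma after the last parameter
 )                         // no comma or semicolon after the list
(input  logic                     clk, wr_en,
 input  logic [$clog2(DEPTH)-1:0] addr,      // port size depends on DEPTH
 input  logic [DATA_W-1:0]        wr_data,
 output logic [DATA_W-1:0]        rd_data);

  logic [DATA_W-1:0] mem [0:DEPTH-1];

  always_ff @(posedge clk) begin
    if (wr_en) mem[addr] <= wr_data;  // synchronous write
    rd_data <= mem[addr];             // synchronous read
  end
endmodule: reg_file
```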
assign sum = a + b;
endmodule: add_type
The in-line named parameter redefinition style documents which parameters are
being overridden, and prevents inadvertent out-of-order errors.
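A short sketch of the named redefinition style follows. The body of add_n_bits shown here is an assumed definition with a single parameter N. The instance names are hypothetical.

```systemverilog
module add_n_bits
#(parameter N = 8)
(input  logic [N-1:0] a, b,
 output logic [N-1:0] sum);
  assign sum = a + b;
endmodule: add_n_bits

module top;
  logic [15:0] a, b, sum;

  // Named redefinition: the parameter being overridden is stated explicitly.
  add_n_bits #(.N(16)) add16 (.a(a), .b(b), .sum(sum));

  // Positional redefinition, #(16), is also legal, but it depends on the
  // order in which the parameters were declared in add_n_bits.
endmodule: top
```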
endfunction: do_magic
3.10 Summary
This chapter has examined the built-in types and data types that are predefined in the
SystemVerilog language. The focus has been on the types and data types that are
useful for writing RTL models that will simulate and synthesize optimally.
SystemVerilog has both 2-state and 4-state data types. The four-value system of
4-state data types allows accurately modeling hardware behavior. Values of 0, 1 and Z
represent physical hardware. The value X is used to model don’t-care conditions,
where a design engineer does not care if the physical hardware will have a 0 or 1
value. Simulators also use the value of X to indicate potential problems, where
simulation cannot determine if actual logic gates would have a 0, 1 or Z. SystemVerilog’s
two-state types should not be used to model hardware behavior, because they do not
have the X value for representing potential design bugs during simulation.
SystemVerilog net types, such as the wire type, are used to connect design blocks
together. Nets always use 4-state data types, and can resolve a final value when there
are multiple sources driving the same net. SystemVerilog variable types are used to
receive values on the left-hand side of assignment statements, and will store the
assigned value until another assignment is made to the variable. SystemVerilog has
several net types and variable data types. The syntax for declaring nets and variables
has been shown, and important semantic rules discussed. The proper usage of these
various nets and variables in RTL models has also been discussed.
Chapter 4
User-defined Types and Packages
Abstract — Engineers can extend the built-in SystemVerilog types with additional
user-defined types. User-defined types are a powerful modeling construct that allow
writing RTL models concisely and accurately. Models written with user-defined types
can be more easily reused in other projects. This chapter presents the syntax for
declaring user-defined types, along with many examples of using user-defined types
in RTL models. The major concepts discussed in this chapter are:
• User-defined type declarations
• Declaration packages
• Enumerated types
• Structures
• Unions
The original Verilog language did not have a place for definitions that would be
used in multiple modules. Each module had to have a redundant copy of tasks,
functions, constants, and other shared definitions. A legacy Verilog coding style was to
place shared definitions into a separate file, which could then be included in other
files using the `include compiler directive. This directive instructs the compiler to
copy the contents of the included file and literally paste those contents into the
location of the `include directive. Though using file inclusion helps to reduce code
redundancy, it is an awkward coding style for code re-use and maintenance.
SystemVerilog added packages to the original Verilog HDL. A package is a
declaration space that can hold shared definitions. Multiple modules and interfaces can
reference these shared definitions either directly, or by importing specific package items,
or by importing the entire package. Packages solve the problem of having to duplicate
definitions in multiple modules and the awkwardness of using `include to copy definitions
into multiple modules.
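The three ways of referencing package items can be sketched as follows. The package contents and the modules user_a, user_b and user_c are assumptions made for this illustration.

```systemverilog
package definitions_pkg;
  typedef logic [31:0] word_t;
  localparam int WORD_BYTES = 4;
endpackage: definitions_pkg

module user_a
(input  definitions_pkg::word_t d,   // direct reference with the :: operator
 output definitions_pkg::word_t q);
  assign q = d;
endmodule: user_a

module user_b;
  import definitions_pkg::word_t;    // explicit import of one package item
  word_t w;
endmodule: user_b

module user_c;
  import definitions_pkg::*;         // wildcard import of the entire package
  word_t w;
  logic [WORD_BYTES-1:0] byte_en;
endmodule: user_c
```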
`ifdef _64bit
  typedef logic [63:0] word_t;
`elsif _32bit
  typedef logic [31:0] word_t;
`else // default is 16 bit
  typedef logic [15:0] word_t;
`endif
typedef struct {
  word_t    a, b;
  opcodes_t opcode_e;
} instruction_t;
The enum and struct constructs shown in Example 4-1 are discussed later in this
chapter. The word_t user-defined type definition in the example is within an `ifdef
conditional compilation directive that defines word_t to be either a 16-bit vector, a
32-bit vector, or a 64-bit vector. The `ifdef construct allows engineers to choose
what code will be compiled at the time the compiler is invoked. All design blocks that
use the word_t user-defined type will use the word size that was selected when the
package was compiled.
The asterisk ( * ) is the wildcard token. A wildcard import effectively adds the
imported package to the search path that SystemVerilog uses.
When a SystemVerilog compiler encounters an identifier (a name), it will first
search in the local scope for the definition of that identifier. A local scope can be
a task, function, begin-end block, module, interface or package. If the identifier definition
is not found in the local scope, the compiler will then search the next scope
In this example, the wildcard import has the effect of adding the
definitions_pkg package to the module’s identifier search path. The port list can
reference the instruction_t user-defined type, and the compiler will find that definition
in the package. Likewise, the case statement can reference the enumerated
type labels used by opcode, and the definitions for those labels will be found in the
package.
Example 4-3 uses explicit imports to bring specific package items into a module.
The explicit imports are more verbose than using a wildcard import, but also make the
module more self-documenting. It is readily apparent exactly what package items are
being used from the package.
  always_comb begin
    case (iw.opcode_e)
      ADD : result = iw.a + iw.b;
      SUB : result = iw.a - iw.b;
      MULT: result = multiplier(iw.a, iw.b);
      DIV2: result = iw.a >> 1;
    endcase
  end
endmodule: alu
NOTE
An explicit import of an enumerated type definition does not import the
labels used within that definition. The labels must also be explicitly
imported. Enumerated types are discussed in more detail in section 4.4 (page
114), and importing the complete enumerated type definition with its labels
is discussed in section 4.4.2 (page 117).
NOTE
Placing a package import statement before the module port list was added in
the SystemVerilog-2009 standard. In SystemVerilog-2005, the import statement
could only appear after the port list, or in the $unit space.
Legacy code written prior to SystemVerilog-2009 would sometimes have the
import statement outside of the module definition, causing the definitions in
a package to be imported into the dangerous $unit name space.
Example 4-4: Explicit package references using the :: scope resolution operator
module alu
(input  definitions_pkg::instruction_t iw,
 input  logic                          clk,
 output definitions_pkg::word_t        result
);
  always_ff @(posedge clk) begin
    case (iw.opcode_e)
      definitions_pkg::ADD : result <= iw.a + iw.b;
      definitions_pkg::SUB : result <= iw.a - iw.b;
      definitions_pkg::MULT: result <= definitions_pkg::
                                       multiplier(iw.a, iw.b);
      definitions_pkg::DIV2: result <= iw.a >> 1;
    endcase
  end
endmodule: alu
Explicitly referencing package items can help to document the design source code.
In Example 4-4, the use of the package name makes it obvious where the definitions
for instruction_t, word_t, ADD, SUB, MULT and multiplier can be found.
However, explicitly referencing the package name for every usage of a package item
is verbose. A more common way of using definitions in a package is to import the
entire package, as discussed previously, in section 4.2.2.1 (page 104). The explicit
package reference shown in this section is only needed when there is a definition with
the same name in multiple packages, and it is necessary to indicate from which package
the item is to be imported.
package gpu_pkg;
  typedef enum logic [1:0] {MUL,DIV,SHIFTL,SHIFTR} op_t;
  typedef enum logic {FIXED,FLOAT} operand_type_t;
  typedef struct {
    op_t opcode;
    operand_type_t op_type;
    logic [63:0] op_a;
    logic [63:0] op_b;
  } instruction_t;
endpackage: gpu_pkg
module processor (...);
  import cpu_pkg::*;
  import gpu_pkg::*;
  import cpu_pkg::instruction_t;
  instruction_t instruction;
  ...
endmodule: processor
An explicit package item reference takes precedence over local definitions or an
explicit import of a package item, which takes priority over wildcard imports. The
explicit import in the preceding code snippet resolves the ambiguity for which definition
of instruction_t should be used in the processor module.
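The precedence rules can be sketched with two hypothetical packages, pkg_a and pkg_b, that both define a type named data_t. The package, type and module names below are assumptions made for illustration only.

```systemverilog
package pkg_a;
  typedef logic [7:0]  data_t;
endpackage: pkg_a

package pkg_b;
  typedef logic [15:0] data_t;
endpackage: pkg_b

module import_precedence_sketch;
  import pkg_a::*;        // wildcard: data_t is only a candidate
  import pkg_b::*;        // wildcard: data_t is only a candidate
  import pkg_a::data_t;   // explicit import takes priority over both wildcards

  data_t        narrow;   // resolves to pkg_a::data_t (8 bits)
  pkg_b::data_t wide;     // explicit reference, never ambiguous (16 bits)

  initial $display("%0d %0d", $bits(narrow), $bits(wide));
endmodule: import_precedence_sketch
```

Without the explicit import, the unqualified reference to data_t would be ambiguous, because both wildcard imports offer a definition with that name.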
package alu_types_pkg;
  import base_types_pkg::*; // import another package
  typedef struct {
    word64_t a, b;
    opcodes_t opcode;
  } instr_t;
endpackage: alu_types_pkg
module alu
  import alu_types_pkg::*;
  (input instr_t instruction, // OK: instr_t was imported
   output word64_t result     // ERROR: word64_t not found
  );                          // in alu_types_pkg
  ...
endmodule: alu
In order for module alu to use definitions from both packages, both packages need
to be imported into alu. SystemVerilog has the ability to chain packages, so that a
module only has to import the last package in a chain, which would be
alu_types_pkg in the preceding code snippet. Package chaining is done by using a
combination of package import and export statements.
package alu_types_pkg;
  import base_types_pkg::*; // import another package
  export base_types_pkg::*; // export (chain) imported items
  typedef struct {
    word64_t a, b;
    opcodes_t opcode;
  } instr_t;
endpackage: alu_types_pkg
An export statement can explicitly export a specific item, or use a wildcard to
export all items imported from another package. Note that, when using wildcard
exports, only the definitions actually used within the package will be exported. In the
preceding snippet, the definition for word32_t in base_types_pkg is not used
within alu_types_pkg, and therefore is not chained, and is not available in the alu
module.
The following explicit export could be added to the alu_types_pkg example
above to chain word32_t, so that it would be available in the alu module.
export base_types_pkg::word32_t; // chain word32 type
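Putting the pieces together, a complete chain might look like the following sketch. The contents of base_types_pkg are not shown in this section, so the definitions given for it here are assumptions, as are the module name alu_chained and its low_word port. Tool support for export should be checked before relying on this style.

```systemverilog
// Assumed contents for base_types_pkg (hypothetical)
package base_types_pkg;
  typedef logic [31:0] word32_t;
  typedef logic [63:0] word64_t;
  typedef enum logic [1:0] {ADD, SUB, MULT, DIV} opcodes_t;
endpackage: base_types_pkg

package alu_types_pkg;
  import base_types_pkg::*;
  export base_types_pkg::*;         // chains the items used below
  export base_types_pkg::word32_t;  // chains an item that is not used below
  typedef struct {
    word64_t  a, b;
    opcodes_t opcode;
  } instr_t;
endpackage: alu_types_pkg

module alu_chained
  import alu_types_pkg::*;          // only the last package in the chain
  (input  instr_t  instruction,
   output word64_t result,          // visible through the wildcard export
   output word32_t low_word);       // visible through the explicit export
  assign result   = instruction.a + instruction.b;
  assign low_word = result[31:0];
endmodule: alu_chained
```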
NOTE
At the time this book was written, some simulators and synthesis compilers
were not yet supporting package chaining. The export statement for package
chaining was added as part of the SystemVerilog-2009 standard. The
SystemVerilog-2005 standard did not define a way to do package chaining.
compiler to read in the file containing the package. The `include directive is placed
at the beginning of each design or testbench file that contains references to the package
items.
// read in the package used by this model
`include "definitions_pkg.sv" // compile the package
module alu
  import definitions_pkg::*; // wildcard import
  (input instruction_t iw,
   input logic clk,
   output word_t result
  );
  ...
endmodule: alu
Care must be taken when using the `include directive to avoid including the same
package multiple times in the same compilation, which SystemVerilog does not allow.
This can be done by placing `ifdef (“if defined”) or `ifndef (“if not defined”)
conditional compilation directives around the package definition, so that the compiler
skips over the entire package if it has already been compiled. Conditional compilation
allows SystemVerilog source code to be optionally compiled, based on whether a
macro name has been defined using a `define compiler directive.
The following example surrounds the package with an `ifndef conditional compilation
directive. The first time the file containing the package is read in by a compiler,
the “not defined” test will be true, and the package will be compiled. The lines
of code that are compiled contain a `define directive that sets the macro name used
by the `ifndef. If this file is read in a second time during the same invocation of the
compiler, the “not defined” test will be false, and code between the `ifndef and
`endif will be skipped.
// Only compile this package if its internal conditional
// compilation flag has not been set. This file sets its
// internal flag the first time it is compiled.
//
`ifndef DEFINITIONS_PKG_ALREADY_COMPILED // if flag is not set
`define DEFINITIONS_PKG_ALREADY_COMPILED // set the flag
package definitions_pkg;                 // and compile pkg
  ... // package item definitions
endpackage: definitions_pkg
`endif // end of conditionally compiling this package
in simulation, that storage would be shared by all references to the task or function.
This would be a different behavior than having duplicate copies. By declaring a task
or function as automatic, new storage is allocated each time it is called, making the
behavior the same as a unique copy of the task or function. This ensures that the simulation
behavior of the pre-synthesis reference to the package task or function will be
the same as post-synthesis behavior.
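A minimal sketch of an automatic package function follows. The package name math_pkg and the function count_ones are hypothetical. The point of the sketch is only that the internal variable count is allocated anew on every call, so calls from different modules cannot interfere with each other.

```systemverilog
package math_pkg;   // hypothetical package name
  // automatic: new storage is allocated for 'count' on each call
  function automatic logic [3:0] count_ones(input logic [7:0] d);
    logic [3:0] count = '0;        // re-initialized on every call
    for (int i = 0; i < 8; i++)
      count += d[i];               // add each bit of d
    return count;
  endfunction
endpackage: math_pkg
```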
For similar reasons, synthesis does not support variable declarations in packages. In
simulation, a package variable is shared by all modules that import the variable. One
module can write to the variable, and another module will see the new value. This
type of inter-module communication without passing values through module ports is
not synthesizable.
NOTE
The $unit is a dangerous shared name space that is fraught with hazards. Its
use can lead to designs that are difficult to compile and to maintain.
endmodule: alu
endmodule: decoder
$unit can contain the same kinds of user definitions as a package, and has the
same synthesis restrictions. Unlike a package, however, the $unit space can lead to
design code that is difficult to maintain, and difficult for software tools to compile.
Some of the hazards of using $unit are:
• Definitions in $unit can be scattered across many files, making code maintenance
and code reuse a nightmare.
When a user-defined type, task, function, or other identifier name from a package
is referenced, it is relatively easy to locate and maintain the definition of the identifier.
There is always an explicit package reference or a package import statement
to show where the definition can be found. When a user-defined type, task,
function, or other identifier is defined in the $unit space, the definition could be
in any file, in any directory, on any server, that makes up the source code of the
design and verification testbench. Locating, maintaining, and reusing the definition
is difficult, at best.
• When definitions in the $unit space are in multiple files, the files must be compiled
in a very specific order.
SystemVerilog requires that definitions be compiled before they are referenced.
When $unit declarations are scattered across many files, it can be difficult, and
even impossible, to compile all files in the proper order.
• A change to a $unit definition requires recompiling all source code files.
Any change to a definition in $unit will necessitate recompiling all source code
that makes up the design and the verification testbench, since any file, anywhere,
could use the definition without importing it. Many software tools will not mandate
that all files be recompiled, but, if not recompiled, design blocks could end
up with obsolete definitions.
• The scope of $unit can be, and often is, different for simulation and synthesis.
Each invocation of a compiler starts a new $unit space that does not share declarations
that are in other $unit spaces. Many SystemVerilog simulators compile
multiple files together. These tools will see a single $unit space. A $unit definition
in one file will be visible to any subsequent file in the single compilation.
Most SystemVerilog synthesis compilers, and some simulators, support separate
file compilation, where each file can be compiled independently. These tools will
see several disconnected $unit spaces. A $unit definition in one file will not
be visible to any other file.
• Duplicate identifier names with different definitions can easily occur.
It is illegal in SystemVerilog to define the same name multiple times in the same
name space. If one file defines a bool_t user-defined type in the $unit space,
and another file also defines a bool_t user-defined type in $unit, the two files
can never be compiled together, since the two definitions would end up in the
same $unit space. To avoid this conflict, engineers must add conditional compilation
directives using `define and `ifdef, so that only the first definition
encountered by the compiler is actually compiled.
Packages can be imported into $unit, but have all the same hazards as definitions
made directly in $unit. Furthermore, care must be taken to not import the same
package into the same name space more than once, which is illegal.
Packages avoid all of the hazards of $unit. Using packages provides a controlled
name space that is easier to maintain and reuse.
Enumerated types provide a means to declare a variable that can have a specific list
of valid values. Each value is associated with a label. An enumerated variable is
declared with the enum keyword, followed by a comma-separated list of labels
enclosed in curly braces ( { } ).
In the following example, variable rgb can have the values of red, green and
blue:
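A minimal sketch of such a declaration is shown here, with label names taken from the sentence above. The wrapper module rgb_sketch is hypothetical and is only there to make the fragment complete.

```systemverilog
module rgb_sketch;               // hypothetical wrapper module
  enum {red, green, blue} rgb;   // no base type given, so the base type is int

  initial begin
    rgb = green;                 // only red, green or blue can be assigned directly
    $display("rgb = %0d", rgb);  // prints the value behind the label
  end
endmodule: rgb_sketch
```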
• WAITE, the first label in the list, has a value of 0, LOAD a value of 1, and READY a
value of 2. (The label WAITE is purposely spelled with an “E” at the end to avoid
any confusion or conflict with the reserved keyword wait in SystemVerilog.)
These defaults are seldom ideal for modeling hardware. The int base type is a 2-
state type, which means any design problems that result in an X during simulation
cannot be reflected in the enumerated variable. The int base type is 32-bits wide,
which is usually a much larger vector size than the hardware being represented
requires. Label values such as 0, 1 and 2 do not represent the encoding used in many
types of hardware designs, such as one-hot values, Gray codes, or Johnson counts.
Specifying the base type and label values has several advantages: It documents the
design engineer’s intent, it can more accurately model the gate-level behavior, and it
allows more accurate RTL to gate-level logic equivalence checking (see Chapter 1,
section 1.8, page 36).
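As an illustration of specifying both the base type and the label values, the following sketch declares a one-hot state encoding with a 4-state, 3-bit base type. The module name, the port names and the state sequence are assumptions made for this sketch.

```systemverilog
module fsm_encoding_sketch (input logic clk, rstN);
  // 4-state base type, 3 bits wide, explicit one-hot label values
  typedef enum logic [2:0] {WAITE = 3'b001,
                            LOAD  = 3'b010,
                            READY = 3'b100} states_t;
  states_t state, next_state;

  always_ff @(posedge clk or negedge rstN)
    if (!rstN) state <= WAITE;      // active-low asynchronous reset
    else       state <= next_state;

  always_comb
    case (state)
      WAITE:   next_state = LOAD;
      LOAD:    next_state = READY;
      READY:   next_state = WAITE;
      default: next_state = WAITE;  // covers X and non-one-hot values
    endcase
endmodule: fsm_encoding_sketch
```

Because the base type is logic, an uninitialized or corrupted state can show up as X in simulation instead of silently becoming 0.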
The following code fragment will result in an error, because the enumerated label
GO is used twice in the same module scope:
module controller (...);
  enum logic {GO=1'b1, STOP=1'b0} fsm1_states_e;
  ...
The error in the preceding example can be corrected by placing at least one of the
enumerated type declarations in a begin-end block, which has its own name scope.
module controller (...);
  ...
  end: fsm1
  end: fsm2
Giving names to the begin-end blocks as shown above is not required, but helps to
document the code for readability and maintenance.
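A sketch of this approach follows, with each enumerated declaration placed in its own named begin-end block so that the label GO can be used in both. The module name, the variable names and the state sequences are hypothetical.

```systemverilog
module controller_sketch (input logic clk, rstN);
  always_ff @(posedge clk) begin: fsm1
    enum logic {GO = 1'b1, STOP = 1'b0} fsm1_state;  // GO and STOP are local to fsm1
    if (!rstN)                 fsm1_state <= STOP;
    else if (fsm1_state == GO) fsm1_state <= STOP;
    else                       fsm1_state <= GO;
  end: fsm1

  always_ff @(posedge clk) begin: fsm2
    enum logic [1:0] {READY, SET, GO} fsm2_state;    // a different GO, local to fsm2
    if (!rstN) fsm2_state <= READY;
    else case (fsm2_state)
      READY:   fsm2_state <= SET;
      SET:     fsm2_state <= GO;
      default: fsm2_state <= READY;
    endcase
  end: fsm2
endmodule: controller_sketch
```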
NOTE
An explicit import of an enumerated type definition does not import the
labels used within that definition.
Using a wildcard import of a package is the easiest solution to this limitation.
The wildcard import makes everything in the package available (see
section 4.2.2.1, page 104).
When a typed enumerated type definition is imported from a package, only the
typed name is imported. The value labels in the enumerated list are not automatically
imported and made visible in the name space in which the enumerated type name is
imported. The following code snippet will not work.
package chip_types_pkg;
endpackage: chip_types_pkg
endmodule: chip
In order to also import the enumerated type labels, either each label must be explicitly
imported, or the package must be wildcard imported. A wildcard import will
make both the enumerated type name and the enumerated value labels visible in the
scope of the import statement. The following partial example shows the use of a wildcard
import.
module chip (...);
Care must be taken when doing wildcard imports from multiple packages. A compilation
or elaboration error will occur if an identifier (a name) is defined in more than
one package, and both packages are wildcard imported. For this situation, the identifier
to be used must be either explicitly imported or directly referenced. Working with
multiple packages is discussed in section 4.2.3 (page 108).
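The two ways of making enumerated labels visible can be sketched as follows. The package chip_types_sketch_pkg and both module names are hypothetical and exist only for this illustration.

```systemverilog
package chip_types_sketch_pkg;   // hypothetical package
  typedef enum logic [1:0] {WAITE, LOAD, READY} states_t;
endpackage: chip_types_sketch_pkg

module chip_explicit;
  import chip_types_sketch_pkg::states_t;  // imports the type name only
  import chip_types_sketch_pkg::WAITE;     // each label used must also be imported
  states_t state;
  initial state = WAITE;
endmodule: chip_explicit

module chip_wildcard;
  import chip_types_sketch_pkg::*;         // type name and all labels are visible
  states_t state;
  initial state = READY;
endmodule: chip_wildcard
```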
state = next_state; // legal: state, next_state are same type
NOTE
Casting expressions to enumerated types. Any value can be cast to a typed enumerated
type, and then assigned to a variable of that enumerated type, even if the
value does not match one of the labels for the enumerated definition.
state = states_t'(temp); // legal, even if value of temp
// does not match any of the label
// values
NOTE
At the time this book was written, enumerated type methods were supported
by some synthesis compilers, but were not universally supported by all synthesis
compilers.
The enumerated type methods have limited usefulness for modeling hardware
behavior. They are merely shortcuts for what can be done with assignment statements.
Due to the synthesis limitations on enumerated type methods, this book only briefly
describes these methods and shows a simple example.
Enumerated methods are called by appending the method name to the end of the
enumerated type variable name, with a period as a separator. The methods are:
• enum_variable_name.first — returns the value of the first member in the enumerated
list of the specified variable.
• enum_variable_name.last — returns the value of the last member in the enumerated
list.
• enum_variable_name.next(N) — returns the value of the next member in the
enumerated list, based on the current value of the enumerated type variable.
Optionally, an integer value can be specified as an argument to next. In this case,
the Nth next value in the enumerated list is returned. When the end of the enumerated
list is reached, the method wraps back to the start of the list. If the current
value of the enumerated type variable does not match any member of the enumerated
list, the value of the first member in the list is returned.
• enum_variable_name.prev(N) — returns the value of the previous member in the
enumerated list, based on the current value of the enumerated type variable. This
method works the same as the next method, except that the prev method iterates
backwards through the list of labels instead of forward.
• enum_variable_name.num — returns the number of labels in the enumerated list of
the variable.
• enum_variable_name.name — returns the string representation of the label for the
current value in the enumerated type variable. If the value is not a member of the
enumeration, the name method returns an empty string.
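The methods listed above can be sketched in a small simulation-only example. The type phase_t, its labels and the module name are hypothetical.

```systemverilog
module enum_methods_sketch;
  typedef enum logic [1:0] {IDLE, RUN, DONE} phase_t;  // hypothetical type
  phase_t phase;

  initial begin
    phase = phase.first();   // first label in the list: IDLE
    phase = phase.next();    // one step forward: RUN
    phase = phase.next(2);   // two steps forward, wrapping at the end: DONE, then IDLE
    phase = phase.prev();    // one step backward, wrapping at the start: DONE
    $display("%0d labels, current label is %s", phase.num(), phase.name());
  end
endmodule: enum_methods_sketch
```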
Printing enumerated types. Enumerated type values can be printed as either the
actual value of the label, or as the name of the label. Printing the enumerated type
variable directly will print the current actual logic value of the enumerated type variable.
Using the name method allows printing the label representing the current value
instead of the actual value.
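A short sketch of the two printing styles, assuming a hypothetical states_t type with one-hot label values:

```systemverilog
module enum_print_sketch;
  typedef enum logic [2:0] {WAITE = 3'b001,
                            LOAD  = 3'b010,
                            READY = 3'b100} states_t;
  states_t state;

  initial begin
    state = LOAD;
    $display("value = %b", state);         // prints the actual logic value
    $display("label = %s", state.name());  // prints the label for that value
  end
endmodule: enum_print_sketch
```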
Figure 4-1 shows the state flow for this state machine. The state machine represents
a counter that can be either incremented or decremented. The counter counts how
many consecutive data_matches have occurred, up to a maximum of 16. Observe
that, for most states, the counter is either incremented by 1, or decremented by 2. The
next and prev enumerated type methods can model this increment or decrement
behavior very concisely, but might not be supported by some synthesis compilers.
Example 4-5: Using enumerated type methods for a state machine sequencer
module confidence_counter
(input logic data_matches, compare_en, rstN, clk,
output logic data_synched
);
4.5 Structures
typedef enum logic [2:0] {NOP, ADD, SUB, MULT, DIV} opcode_t;
struct {
  int a, b;             // 32-bit 2-state variables
  opcode_t opcode;      // user-defined type
  logic [23:0] address; // 24-bit variable
  bit error;            // 1-bit 2-state variable
} instruction_word;
A structure can bundle together any number of variable data types, including user-
defined types. Parameter and localparam constants can also be included in a structure.
A parameter in a structure cannot be redefined like parameters in modules. Parameters
in structures are treated as localparams.
The values in the structure expression must be listed in the order in which they are
defined in the structure, as shown in the preceding example. Alternatively, the structure
expression can specify the names of the structure members to which values are
being assigned, where the member name and the value are separated by a colon. The
member names within the structure expression are referred to as tags. When member
names are specified, the expression list can be in any order.
instruction_word <= '{address:0, opcode:SUB,
                     a:100, b:7, error:'1};
It is illegal to mix by-name and by-order assignments in the same structure expression.
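The two forms of structure expression can be compared side by side in the following sketch. The wrapper module and the values assigned are hypothetical. The structure itself mirrors the instruction_word declaration shown earlier in this section.

```systemverilog
module struct_expr_sketch;
  typedef enum logic [2:0] {NOP, ADD, SUB, MULT, DIV} opcode_t;
  struct {
    int          a, b;
    opcode_t     opcode;
    logic [23:0] address;
    bit          error;
  } instruction_word;

  initial begin
    // by order: values follow the declaration order a, b, opcode, address, error
    instruction_word = '{100, 7, SUB, 24'h0, 1'b0};
    // by name: member tags allow the values to be listed in any order
    instruction_word = '{error: 1'b0, opcode: ADD, a: 5, b: 3, address: 24'hFF};
  end
endmodule: struct_expr_sketch
```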
Enumerated types in structures. The previous two examples with default values
have a semantic error. The default value assigned to structure members must be compatible
with the data type of the member. Since most SystemVerilog variables are
loosely typed, almost any default value will be compatible. Enumerated type variables,
however, are more strongly typed. Assignments to an enumerated type variable
must be either a label from its enumerated list, or another enumerated variable of the
same enumerated type definition (enumerated type assignment rules are discussed in
section 4.4.3, page 118).
The two assignment statements to instruction_word above attempt to assign
opcode a default value of 0. This is an illegal value for opcode, which is an
opcode_t enumerated type variable (the typedef definition for opcode_t is shown
in section 4.5.1, page 124). When a member of a structure is an enumerated type variable,
the structure expression must specify a legal explicit value for that member. A
default value can be specified for all other members. For example:
always_ff @(posedge clk)
  if (!rstN) instruction_word <= '{opcode:NOP, default:0};
  else
vector. The right-most bit of the last member in the structure is the least-significant bit
of the vector, and is numbered as bit 0. This is illustrated in Figure 4-2.
[Figure 4-2: packed structure stored as a vector, with bit positions 40, 39, 31, 15 and 0 marked]
Signed packed structures. Packed structures can be declared with the signed or
unsigned keywords. These modifiers affect how the entire structure is perceived
when used as a vector in mathematical or relational operations. They do not affect
how members of the structure are perceived. Each member of the structure is considered
signed or unsigned, based on the type declaration of that member. A part-select
of a packed structure is always unsigned, the same as with part selects of vectors.
always_comb begin
  lt = 0; gt = 0;
  if (d1 < d2) lt = '1; // signed comparison
  else if (d1 > d2) gt = '1;
end
typedef struct {
  logic [31:0] a, b;
  opcode_t opcode;
  logic [23:0] address;
} instruction_word_t;
endpackage: definitions_pkg
module alu
  import definitions_pkg::*; // wildcard import
  (input instruction_word_t iw, // user-defined port type
   input wire clk
  );
  ...
endmodule
An unpacked structure must be a typed structure in order to pass the structure
through ports. The connections to the port must be a structure of the exact same type
as the port. That is, both the port and the connections on both sides of the port must all
be declared from the same typedef definition. This restriction only applies to
unpacked structures. A packed structure passed through a module port is treated like a
vector. The external connection to the port can be a packed structure of the same type,
or any type of vector.
Typed structures can also be passed as arguments to a task or function by declaring
the task or function argument as the structure type.
endfunction: calculate_result
endmodule: processor
When a task or function is called that has an unpacked structure as a formal argument,
a structure of the exact same type must be passed to the task or function. A
packed structure formal argument is treated as a vector, and any type of vector
can be passed to it.
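A sketch of passing a typed structure to a function follows. The package fn_sketch_pkg, its types, and the module processor_sketch are hypothetical. Only the function name calculate_result is borrowed from the surrounding text, and its body here is an assumption.

```systemverilog
package fn_sketch_pkg;   // hypothetical package
  typedef enum logic [1:0] {ADD, SUB} op_t;
  typedef struct {       // typed, unpacked structure
    logic [31:0] a, b;
    op_t         opcode;
  } instr_t;
endpackage: fn_sketch_pkg

module processor_sketch
  import fn_sketch_pkg::*;
  (input  instr_t      iw,
   output logic [31:0] result);

  // the formal argument is declared with the structure type
  function automatic logic [31:0] calculate_result(input instr_t instr);
    case (instr.opcode)
      ADD:     return instr.a + instr.b;
      SUB:     return instr.a - instr.b;
      default: return '0;
    endcase
  endfunction: calculate_result

  // the actual argument must be declared from the same typedef
  always_comb result = calculate_result(iw);
endmodule: processor_sketch
```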
4.6 Unions
A union is a single storage element that can have multiple data type representations.
The declaration of a union is similar to a structure, but the inferred hardware is very
different. A structure is a collection of several variables. A union is a single variable
that can represent different data types at different times. The variable types a union
can store are listed between curly braces ( { } ), with a name for each variable type.
union {
  int s;
  int unsigned u;
} data;
The variable is data, in this example. The data variable has two possible data
types: a signed integer type named s, or an unsigned integer value named u.
A typical application of unions in RTL modeling is when a value might be represented
as several different types, but only as one type at any specific clock cycle. For
example, a data bus might sometimes carry a packet of data using the User Network
Interface (UNI) telecommunications protocol. At other times, the same data bus
might carry a packet of data using the Network to Network Interface (NNI)
telecommunications protocol. A SystemVerilog union can represent this dual usage of the
same bus. Another usage of unions is to represent a shared hardware resource, such as
a hardware register that can store different types of data at different times.
SystemVerilog has three types of unions: unpacked unions, packed unions and
tagged unions. Most synthesis compilers only support packed unions.
Unpacked and tagged unions are not supported by most synthesis compilers. These
union types can represent storage for any data type, including data types that are not
synthesizable. Unpacked unions and tagged unions can be useful for modeling testbenches
and high-level abstract models, but should not be used for RTL modeling.
Packed unions are declared by adding the keyword packed immediately after the
union keyword.
typedef union packed { // packed union type
  int s;
  int unsigned u;
} data_t;
A packed union allows data to be written using one format, and read back using a
different format. The design model does not need to do any special processing to keep
track of how data was stored. This is because the data in a packed union will always
be stored using the same number of bits. The following example defines a packed
union in which a value can be represented in two ways: either as a data packet (using
a packed structure) or as an array of contiguous bytes.
typedef struct packed {
logic [15:0] source_address;
logic [15:0] destination_address;
logic [23:0] data;
logic [ 7:0] opcode;
} data_packet_t;
union packed {
data_packet_t packet; // packed structure
logic [7:0] [7:0] bytes; // packed array
} dreg;
Figure 4-3 illustrates how the two data types of dreg are represented.
Figure 4-3: Packed union with two representations of the same storage
packet: source_address [63:48] | destination_address [47:32] | data [31:8] | opcode [7:0]
bytes:  bytes[7] [63:56] | bytes[6] [55:48] | bytes[5] [47:40] | bytes[4] [39:32] |
        bytes[3] [31:24] | bytes[2] [23:16] | bytes[1] [15:8]  | bytes[0] [7:0]
Because the union is packed, the information will be stored using the same bit
alignment, regardless of which union representation is used. This means a value could
be loaded using the bytes format (perhaps from a serial input stream of bytes), and
then the same value can be read using the data_packet format.
always_ff @(posedge clk, negedge rstN) // async reset
  if (!rstN) begin // active-low reset
    dreg.packet <= 0; // reset using packet type
    i <= 0;
  end
  else if (load_data) begin
    dreg.bytes[i] <= data_in; // store using bytes type
    i <= i + 1; // nonblocking, consistent with the reset of i
  end
  else if (data_ready) begin
    case (dreg.packet.opcode) // read as packet type
      //...
    endcase
  end
endpackage: definitions_pkg
Example 4-7: Arithmetic Logical Unit (ALU) with structure and union ports
module alu
  import definitions_pkg::*; // wildcard import the package
  (input logic clk, rstN,
input instr_t iw, // input is a structure
output data_t alu_out // output is a union
);
Figure 4-4 shows the result of synthesizing this example. The schematic image is
too small to be meaningful because of the page size of this book, but illustrates two
important characteristics of using structures and unions in RTL models:
• Structures and unions can concisely model a significant amount of functionality.
The ability to model more functionality with fewer lines of code is one of the reasons
features such as structures and unions were added to the original Verilog.
• Unions, when used with the RTL coding guidelines described in this section, can
represent multiplexed functionality, allowing multiple resources (signed and
unsigned adders, subtracters, multipliers and dividers in this example) to share the
same hardware registers. The circles in Figure 4-4 represent generic arithmetic
operations. The trapezoidal symbols represent multiplexors.
Figure 4-4: Synthesis result for Example 4-7: ALU with structure and union ports
Structures and unions can include packed or unpacked arrays. A packed structure or
union can only include packed arrays.
typedef struct { // unpacked structure
logic data_ready;
logic [7:0] data [0:3]; // unpacked array
} packet_t;
The inputs to this instruction register include separate operands, an opcode, and a flag
indicating if the operands are signed or unsigned. The model loads these separate
input values into an instruction register array. A write pointer input controls where the
data is loaded. The output of the model is a single instruction structure, selected from
the instruction register using a read pointer input.
This example uses the same package items shown previously in Example 4-6 (page
134).
 input data_t ...
 input op_t opcode,
 input logic [4:0] write_pointer,
 input logic [4:0] read_pointer,
 output instruction_t iw
);
Figure 4-5 shows the result of synthesizing this example. The schematic image is
too small to be readable at the page size of this book, but illustrates how structures,
unions and arrays can be used to model a significant amount of design functionality
with very few lines of code. The rectangular symbol towards the upper-right of the
schematic is an instance of a generic RAM that the synthesis compiler chose to represent
the storage of the array in the RTL model. The synthesis compiler will implement
this generic RAM as one or more synchronous storage devices in the final step of synthesis,
where the generic gate-level functionality is mapped to a specific ASIC or
FPGA device.
Figure 4-5: Synthesis result for Example 4-8: instruction register with structures
4.8 Summary
The topics presented in this chapter provide powerful ways to manage complex
design data in a concise, maintainable and reusable form.
All of the topics discussed in this chapter were added to the original Verilog language
as part of the newer SystemVerilog generation of the language.
Chapter 5
RTL Expression Operators
Abstract — Chapter 5 explores the programming operators that are used for RTL
simulation and synthesis. Operators evaluate one or more expressions and determine
a result. For example, the arithmetic + operator adds two expressions together and
returns the sum. The operation could be performed as unsigned integer, signed integer,
or floating-point, with or without a carry, and with a 2-state or 4-state result. Understanding
the rules of SystemVerilog operators is essential for writing RTL models that
simulate and synthesize correctly. The topics covered in this chapter include:
• 2-state and 4-state operations
• X-optimism and X-pessimism
• Expression vector sizes
• Concatenate and replicate operators
• Conditional (ternary) operator
• Bitwise operators
• Unary reduction operators
• Logical operators
• Comparison operators (equality and relational)
• Case equality (identity) operators
• Set membership (inside) operator
• Shift operators
• Streaming operators (pack and unpack)
• Arithmetic operators
• Increment and decrement operators
• Assignment operators
• Cast operators
Operators perform operations on operands. Most operators have two operands. For
example, in the operation a + b, the operands of the + (add) operation are a and b.
Each operand is referred to as an expression. An expression can be a literal value, a
variable, a net, the return of a function call, or the result of another operation. Expressions
have a number of characteristics which affect how an operation is performed.
These characteristics are discussed in sections 5.1.1 through 5.1.5.
The result of the operation is the value 4'b0000. This is because the & operator
models a digital AND logic gate for each bit of its operands. In digital logic, a 0
ANDed with any value will result in a 0. The high-impedance bit (represented by a Z)
and the unknown bit (represented by an X) in operand a become zeros in the result
because these bits are ANDed with their corresponding bits in b, which have a value
of 0. This behavior is referred to as X-optimism. Simulation will have a known result,
even though an operand has bits with X or Z values.
X-optimism only applies to values where simulation can accurately predict how
actual logic gates behave. In the following example, the b operand is all ones instead
of all zeros.
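A minimal simulation sketch of both cases follows. The operand values and the module name x_optimism_demo are assumptions chosen for illustration; a contains one Z bit and one X bit, as in the discussion above.

```systemverilog
// Illustrative sketch only: values are assumed, not taken from the book.
module x_optimism_demo;
  logic [3:0] a, b, result;

  initial begin
    a = 4'b01zx;              // bit 1 is Z, bit 0 is X

    b = 4'b0000;              // AND with all zeros
    result = a & b;
    $display("%b", result);   // 0000: X-optimistic, every bit is known

    b = 4'b1111;              // AND with all ones
    result = a & b;
    $display("%b", result);   // 01xx: the Z and X bits propagate as X
  end
endmodule: x_optimism_demo
```

With b all ones, simulation cannot predict what actual gates would output for the Z and X bits of a, so those result bits are X.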
logic [15:0] a;
logic [31:0] b;
logic result;
The size context for arithmetic operations is more complex than that of other operators.
The size context takes into account not only the operands of the operator, but
also the vector size of all expressions on both the right-hand side and left-hand side of
an assignment statement, as shown in the following code:
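As a hedged sketch of this rule (the signal names, widths and values here are assumptions for illustration), the same addition produces different results depending on the width of the left-hand side:

```systemverilog
// Illustrative sketch only: names and values are assumed.
module size_context_demo;
  logic [7:0] a, b;
  logic [7:0] sum8;   // same width as the operands
  logic [8:0] sum9;   // one bit wider than the operands

  initial begin
    a = 8'd200;
    b = 8'd100;
    sum8 = a + b;     // add is performed at 8 bits: the carry is lost
    sum9 = a + b;     // operands are extended to 9 bits before the add
    $display("sum8=%0d sum9=%0d", sum8, sum9);   // sum8=44 sum9=300
  end
endmodule: size_context_demo
```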
NOTE
RTL synthesis compilers typically do not support real (floating-point)
expressions. High-level Synthesis (HLS) tools can be used for complex
arithmetic design. Floating point and fixed point design is outside the scope
of this book on RTL modeling.
The concatenate and replicate operators join multiple expressions together to form
a vector expression. The total number of bits in the resultant vector is the sum of all
the bits in each subexpression. There are two forms of concatenation, simple and
replicated. A simple concatenation joins any number of expressions together. A replicated
concatenation joins expressions together and then replicates that result a specified
number of times. Table 5-3 shows the general syntax and usage of the
concatenate and replicate operators.
The following variables and values are used to show the results of these operators.
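The sketch below uses assumed variables and values (not necessarily those of the original table) to show a simple concatenation, a replicated concatenation, and the two combined.

```systemverilog
// Illustrative sketch only: variables and values are assumed.
module concat_demo;
  logic [3:0] a;
  logic [1:0] b;
  logic [7:0] c, d, e;

  initial begin
    a = 4'b1011;
    b = 2'b01;
    c = {a, b, 2'b00};      // simple concatenate:  8'b10110100
    d = {4{b}};             // replicate b 4 times: 8'b01010101
    e = {a, {2{b}}};        // nested replicate:    8'b10110101
    $display("%b %b %b", c, d, e);
  end
endmodule: concat_demo
```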
Examples 5-1 and 5-2 illustrate two common applications of the concatenate operator
in RTL modeling: joining multiple signals together on the right-hand or left-hand
side of an assignment statement. Following each example, Figures 5-1 and 5-2 show
how the concatenate operators disappear in the gate-level functionality generated by
synthesis. Nevertheless, the concatenate operators are a useful construct for representing
hardware functionality in a concise way in RTL models.
Figure 5-1: Synthesis result for Example 5-1: Concatenate operator (status register)
NOTE
How synthesis compilers implement an operator can be influenced by a
number of factors, including: the target device, other operators or programming
statements used in conjunction with the operator, the synthesis compiler
utilized, and the synthesis options and constraints that were specified.
The status register in Example 5-1 has two unused bits, which have a constant value
of 1. The synthesis compiler used to generate the implementation of the status register
shown in Figure 5-1 mapped these two unused bits to a simple pull-up value on the 8-bit
output. Other synthesis compilers, or specifying different synthesis constraints,
might map this same RTL functionality differently, such as by using flip-flops that are
preset to a value of 1.
Figure 5-2: Synthesis result for Example 5-2: Add operator (adder with carry in/out)
cussed in Chapters 3, section 3.7.3 (page 91) and 4, section 4.5.3 (page 125).
Although the assignment list operator appears similar to a concatenate operator, the
functionality is very different. The concatenate operator joins several values together
to create a new, single value. The assignment list operator begins with an apostrophe
( ' ), and is used to assign a collection of individual values to the individual elements
of an array or the individual members of a structure.
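The difference can be sketched as follows. The type name packet_t and the variable names are hypothetical, introduced only for this illustration.

```systemverilog
// Illustrative sketch only: packet_t, pkt, lut and word are assumed names.
module assignment_list_demo;
  typedef struct {
    logic [7:0] addr;
    logic [7:0] data;
  } packet_t;

  packet_t     pkt;
  logic [7:0]  lut [0:3];     // unpacked array with 4 elements
  logic [15:0] word;

  initial begin
    // Concatenate: joins two 8-bit values into one 16-bit value
    word = {8'hAB, 8'hCD};                  // word = 16'hABCD

    // Assignment list: each value goes to a separate member or element
    pkt = '{8'hAB, 8'hCD};                  // pkt.addr = AB, pkt.data = CD
    lut = '{8'h00, 8'h11, 8'h22, 8'h33};    // lut[0] = 00 ... lut[3] = 33
    $display("%h %h %h %h", word, pkt.addr, pkt.data, lut[3]);
  end
endmodule: assignment_list_demo
```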
A widely used operator in RTL modeling is the conditional operator, which is also
referred to as a ternary operator. This operator is used to choose between two expressions.
The tokens used to represent the conditional operator are listed in Table 5-2.
The expression listed before the question mark ( ? ) is referred to as the control
expression. It can be a simple integral value (a vector of any size, including 1-bit) or
the result of another operation that returns an integral value. For example:
logic sel, mode, enableN;
logic [7:0] a, b, y1, y2;
assign y1 = sel ? a : b;
assign y2 = (mode & !enableN) ? a + b : a - b;
The control expression is evaluated as true or false using the following rules:
• The expression is true if any bit is 1.
• The expression is false if all bits are 0.
• The expression is unknown if no bits are 1 and not all bits are 0, which can occur
if there are some bits that are X or Z.
With 4-state values, it is possible for a control expression to be neither true nor
false. In the following value, none of the bits are 1, but not all of the bits are 0.
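The sketch below uses an assumed control value, 4'b00x0, that fits this description. When the control expression is unknown, SystemVerilog simulation evaluates both expressions and merges them bit by bit: bits that agree keep their value, and bits that differ become X. The names and data values are illustrative assumptions.

```systemverilog
// Illustrative sketch only: names and values are assumed.
module unknown_select_demo;
  logic [3:0] sel;
  logic [7:0] a, b, y;

  initial begin
    a   = 8'b1100_1010;
    b   = 8'b1010_1010;
    sel = 4'b00x0;          // no bit is 1, but not all bits are 0
    y   = sel ? a : b;      // unknown control: a and b are merged bitwise
    $display("%b", y);      // 1xx01010: only bits 6 and 5 differ
  end
endmodule: unknown_select_demo
```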
Example 5-3: Using the conditional operator: multiplexed 4-bit register D input
module muxed_register
#(parameter WIDTH = 4)                  // register size
(input  logic             clk,          // 1-bit input
 input  logic             data_select,  // 1-bit input
 input  logic [WIDTH-1:0] d1, d2,       // scalable input size
 output logic [WIDTH-1:0] q_out         // scalable output size
);
Figure 5-3: Synthesis result for Example 5-3: Conditional operator (mux’ed register)
The circuit shown in Figure 5-3 is the intermediate generic synthesis result, before
the synthesis compiler has mapped the circuit to a specific ASIC or FPGA target
implementation. The synthesis compiler used to produce Figure 5-3 utilized generic
flip-flops with unused set and reset inputs. The final implementation using an ASIC
or FPGA library might be able to use flip-flops that do not have these inputs, if available
in the target device. A different synthesis compiler might use different generic
components to represent these intermediate results.
Example 5-4: Using the conditional operator: 4-bit adder with tri-state outputs
module tri_state_adder
#(parameter N = 4)                      // N-bit adder size
(input  logic             enable,       // output enable
 input  logic     [N-1:0] a, b,         // scalable input size
 output tri logic [N-1:0] out           // tri-state output, net type
);
In this example, the conditional operator ( ? : ) selects whether the out port
should be assigned the value of (a + b) or hi-impedance. If enable is false, out is
assigned 'z. The 'z token is a literal value that sets all bits of an expression to hi-impedance,
and automatically scales to the vector size of the expression. See section
3.2.2 (page 65) in chapter 3 for more details on vector fill literal values.
Observe in Example 5-4 that the out tri-state output port is declared as a
tri logic type, instead of the usual logic type. The logic data type only defines
that the port can have 4-state values. It does not define whether the port type is a net
or variable type. An output port will default to a variable type, unless explicitly
declared as a net type. (Conversely, an input port will default to a net type, unless
explicitly declared as a variable). The tri keyword declares a net type. The tri type
is the same as the wire type in every way, but the tri keyword can help document
that the net or port is expected to have tri-state (hi-impedance) values.
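A minimal sketch of such a tri-state adder is shown below. The port list follows Example 5-4, but the continuous assignment is an assumption based on the description above, and the module is renamed tri_state_adder_sketch to mark it as illustrative.

```systemverilog
// Illustrative sketch only: the assign statement is assumed from the
// surrounding description, not copied from Example 5-4.
module tri_state_adder_sketch
#(parameter N = 4)                      // N-bit adder size
(input  logic             enable,       // output enable
 input  logic     [N-1:0] a, b,         // scalable input size
 output tri logic [N-1:0] out           // tri-state output, net type
);
  // Drive the sum when enabled, otherwise release the net to hi-impedance.
  // Because out is a net, other drivers can control it while this one is 'z.
  assign out = enable ? (a + b) : 'z;
endmodule: tri_state_adder_sketch
```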
Figure 5-4: Synthesis result for Example 5-4: Conditional operator (tri-state output)
Bitwise operators perform their operations one bit at a time, working from the
right-most bit (the least-significant bit) towards the left-most bit (the most-significant
bit). Table 5-3 lists the bitwise operators.
Bitwise inversion. The bitwise invert operator inverts each bit of its single operand,
working from right to left. The result is a one’s complement of the operand value. The
bitwise inversion operator is X-pessimistic — the result of inverting an X or Z value
is always an X. Table 5-4 shows the truth table for the bitwise inversion. The results
in the table are for each bit of the operand.
operand   result
   0         1
   1         0
   X         X
   Z         X
Bitwise AND. The bitwise AND operator does a Boolean AND of each bit of the
first operand with the corresponding bit in the second operand, working from right to
left. The bitwise AND operator is X-optimistic: a 0 ANDed with any value will result
in a 0. Table 5-5 shows the truth table for the bitwise AND. The results in the table are
for each bit of the two operands.
&   0   1   X   Z
0   0   0   0   0
1   0   1   X   X
X   0   X   X   X
Z   0   X   X   X
Bitwise OR. The bitwise OR operator does a Boolean OR of each bit of the first
operand with the corresponding bit in the second operand, working from right to left.
The bitwise OR operator is X-optimistic — a 1 ORed with any value will result in a 1.
Table 5-6 shows the truth table for the bitwise OR.
|   0   1   X   Z
0   0   1   X   X
1   1   1   1   1
X   X   1   X   X
Z   X   1   X   X
Bitwise XOR. The bitwise XOR operator does a Boolean exclusive-OR of each bit
of the first operand with the corresponding bit in the second operand, working from
right to left. The bitwise XOR operator is X-pessimistic — the result of exclusive-
ORing an X or Z value is always an X. Table 5-7 shows the truth table for the bitwise
XOR.
^   0   1   X   Z
0   0   1   X   X
1   1   0   X   X
X   X   X   X   X
Z   X   X   X   X
Bitwise XNOR. The bitwise XNOR operator does a Boolean exclusive-NOR of each
bit of the first operand with the corresponding bit in the second operand, working
from right to left. The bitwise XNOR operator is X-pessimistic: the result of exclusive-NORing
an X or Z value is X.
Table 5-8 shows the truth table for the bitwise XNOR.
~^  0   1   X   Z
0   1   0   X   X
1   0   1   X   X
X   X   X   X   X
Z   X   X   X   X
Example 5-5: Using bitwise operators: multiplexed N-bit wide AND/XOR operation
// User-defined type definitions
package definitions_pkg;
typedef enum logic {AND_OP, XOR_OP} mode_t;
endpackage: definitions_pkg
always_comb
case (mode)
AND_OP: result = a & b;
XOR_OP: result = a ^ b;
endcase
endmodule: and_xor
Figure 5-5 shows how the RTL model in Example 5-5 might synthesize. As has
been noted earlier in this chapter, the implementation created by synthesis can be
influenced by a number of factors, including: the target device, any other operators or
programming statements used in conjunction with the operator, the synthesis compiler
utilized, as well as the synthesis options and constraints that were specified.
Figure 5-5: Synthesis result for Example 5-5: Bitwise AND and OR operations
Reduction operators perform their operations on all the bits of a single operand and
return a scalar (1-bit) result. Table 5-9 lists the reduction operators.
The reduction operators include a NAND and a NOR operator, which the bitwise
operators do not have. The reduction AND, OR and XOR operators perform their
operation one bit at a time, working from the right-most bit (the least-significant bit)
towards the left-most bit (the most-significant bit). The operations use the same truth
tables as their corresponding bitwise operators, as shown in section 5.4. The reduction
NAND, NOR and XNOR operators first perform a reduction AND, OR or XOR operation,
respectively, and then invert the 1-bit result.
The AND, NAND, OR and NOR operators are X-optimistic. For a reduction AND,
if any bit in the operand is 0, the result will be 1'b0. For a reduction NAND, if any
bit in the operand is 0, the result will be 1'b1. Similarly, for a reduction OR, if any bit
in the operand is 1, the result will be 1'b1. For a reduction NOR, if any bit in the
operand is 1, the result will be 1'b0. The reduction XOR and XNOR operators are X-pessimistic.
If any bit of the operand is X or Z, the result will be 1'bx. Table 5-10
shows the result of each reduction operator for a few example values.
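The rules above can be checked with a short simulation sketch. The operand values below are assumptions chosen to exercise each rule, and may differ from the rows of the original table.

```systemverilog
// Illustrative sketch only: operand values are assumed.
module reduction_demo;
  logic [3:0] v;

  // Print the six reduction results in the order: &  ~&  |  ~|  ^  ~^
  task automatic show;
    $display("%b: %b %b %b %b %b %b", v, &v, ~&v, |v, ~|v, ^v, ~^v);
  endtask

  initial begin
    v = 4'b0000; show();    // 0000: 0 1 0 1 0 1
    v = 4'b1111; show();    // 1111: 1 0 1 0 0 1
    v = 4'b01zx; show();    // 01zx: 0 1 1 0 x x
  end
endmodule: reduction_demo
```

In the last case the AND and OR families still return known values, because the operand contains both a 0 and a 1. The XOR and XNOR results are X because of the Z and X bits.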
Example 5-6 illustrates a small RTL model that utilizes reduction operators to
check for correct parity of a data value. Figure 5-6 shows how this RTL model might
synthesize.
Figure 5-6: Synthesis result for Example 5-6: Reduction XOR (parity checker)
Logical operators evaluate their operands, and return a value indicating whether the
result of the evaluation is true or false. For example, the operation a && b tests to see
if both a and b are true. If both operands are true, the && operator returns true. Otherwise,
the operator returns false.
Logical operator return values. SystemVerilog does not have a built-in true or false
Boolean value. Instead, logical operators return the logic value 1'b1 (a
one-bit wide logic 1) to represent true, and 1'b0 to represent false. Logical operators
can also return a 1'bx to indicate an ambiguous condition where simulation cannot
determine if actual logic gates would evaluate as a true or false condition.
The logical negate operator is often referred to as the not operator, which is short
for “not true”.
Logical operators perform their operations by first doing a logical OR reduction of
each operand, which yields a 1-bit result. That result is then evaluated to determine if
it is true or false. In the case of the negate operator, the 1-bit result is first inverted,
and then evaluated as true or false.
Tables 5-12 and 5-13 show the results of these logical operators for a few example
values.
Operand 1   Operand 2   &&     ||
4'b0000     4'b01zx     1'b0   1'b1

Operand     !
4'b0000     1'b1
4'b01zx     1'b0
A logical operation will return true if any bit of the vector is set, which could lead
to design errors when testing for specific bits. When evaluating vector values, use an
equality or relational operator to test for acceptable values.
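A brief sketch of this pitfall follows. The READY encoding and the signal names are hypothetical, chosen only to contrast the two styles of test.

```systemverilog
// Illustrative sketch only: READY, status, loose and exact are assumed names.
module vector_test_demo;
  localparam logic [3:0] READY = 4'b0001;   // hypothetical status encoding
  logic [3:0] status;
  logic       loose, exact;

  always_comb begin
    loose = 1'b0;
    exact = 1'b0;
    if (status)          loose = 1'b1;   // true when ANY bit of status is 1
    if (status == READY) exact = 1'b1;   // true only for the intended value
  end

  initial begin
    status = 4'b0110;                    // not the READY value
    #1 $display("loose=%b exact=%b", loose, exact);   // loose=1 exact=0
  end
endmodule: vector_test_demo
```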
Example 5-7 illustrates a small RTL model that uses the negate, logical AND and
logical OR operators. The design is a logical comparator that sets a flag if either of
two data values falls within a configurable range of values.
Example 5-7: Using logical operators: set flag when values are within a range
module status_flag
#(parameter N = 4,                    // data bus size
  logic [N-1:0] MIN = 'h7,            // minimum must-have value
  logic [N-1:0] MAX = 'hC             // maximum must-have value
)
(input  logic         clk,            // clk input
 input  logic         rstN,           // active-low async reset
 input  logic [N-1:0] d1, d2,         // scalable input size
 output logic         in_range        // set if either d1 or d2
);                                    // is within MIN/MAX range
Figure 5-7 shows how the RTL model in Example 5-7 might synthesize.
Figure 5-7: Synthesis result for Example 5-7: Logical operators (in-range compare)
The short-circuiting corner case described in the preceding paragraph can be
avoided by not modifying variables that are external to the function. This corner case
can lead to critical mismatches between how simulation behaves and how the gate-level
implementation from synthesis behaves.
Comparison operators evaluate their operands and return a value indicating whether
the result of the evaluation is true or false. The logic value 1'b1 (a one-bit wide logic
1) represents true, and 1'b0 represents false. In simulation, these comparison operators
can also return a 1'bx to indicate an ambiguous condition where simulation cannot
determine if actual logic gates would result in a 1 (true) or 0 (false).
Table 5-15 lists the comparison operators. All comparison operators are synthesizable.
Operator   Example usage   Description
==         m == n          Equality: Is m equal to n?
!=         m != n          Not Equality: Is m not equal to n?
<          m < n           Less-than: Is m less than n?
<=         m <= n          Less-than or equal: Is m less than or equal to n?
>          m > n           Greater-than: Is m greater than n?
>=         m >= n          Greater-than or equal: Is m greater than or equal to n?
Pessimistic comparisons. Comparison operators differ from most other SystemVerilog
operators in that they are always pessimistic. If either operand has even a
single bit that is X or Z, the operand is considered unknown, and therefore the result
will be unknown. This pessimism at the RTL level is an abstraction from actual logic
gate-level behavior, which would be more optimistic. Consider the following values
with a greater-than operation:
logic [3:0] c, d;
logic gt;
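A hedged sketch of this pessimism, using assumed operand values: whichever value the X bit of c takes, c is either 8 or 9 and is therefore greater than d, yet the RTL simulation result is unknown.

```systemverilog
// Illustrative sketch only: operand values are assumed.
module pessimistic_compare_demo;
  logic [3:0] c, d;
  logic       gt;

  initial begin
    c  = 4'b100x;           // either 8 or 9, depending on the X bit
    d  = 4'b0001;           // 1
    gt = (c > d);           // actual gates would produce a 1
    $display("%b", gt);     // x: the relational operator is pessimistic
  end
endmodule: pessimistic_compare_demo
```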
Signed and unsigned comparisons. The comparison operators can perform either
signed or unsigned comparisons. The rule is: if both operands are signed expressions,
assign u1 = 5; // unsigned 5
assign s1 = -3; // negative 3
If mixed signed and unsigned comparisons are a requirement of the design, then it
might be desirable to compare absolute values instead of negative values. SystemVerilog
does not have an operator or built-in function that returns the absolute value of a
negative value. Instead, the absolute value must be calculated by performing a two’s
complement operation. The arithmetic unary subtract operator ( - ) can be used for
this.
A synthesizable function to perform an absolute operation with parameterized bus
widths is:
function [WIDTH-1:0] abs_f (logic signed [WIDTH-1:0] a);
return (a >= 0)? a : -a; // 2's complement negative values
endfunction: abs_f
An example of using this function is:
parameter WIDTH = 8;
logic [WIDTH-1:0] u1;
logic signed [WIDTH-1:0] s1;
logic gt1, gt2;
assign u1 = 5; // unsigned 5
assign s1 = -3; // negative 3
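The complete comparison can be sketched as a self-contained module. The two gt assignments are assumptions about how the function would typically be applied. A mixed signed and unsigned comparison is performed as unsigned, so -3 is treated as 253.

```systemverilog
// Illustrative sketch only: the gt1 and gt2 assignments are assumed.
module mixed_compare_demo;
  parameter WIDTH = 8;
  logic        [WIDTH-1:0] u1;
  logic signed [WIDTH-1:0] s1;
  logic gt1, gt2;

  function [WIDTH-1:0] abs_f (logic signed [WIDTH-1:0] a);
    return (a >= 0) ? a : -a;   // 2's complement negative values
  endfunction: abs_f

  assign u1 = 5;                // unsigned 5
  assign s1 = -3;               // negative 3

  assign gt1 = (u1 > s1);         // unsigned compare, 5 > 253: gt1 = 0
  assign gt2 = (u1 > abs_f(s1));  // compares 5 with 3:         gt2 = 1

  initial #1 $display("gt1=%b gt2=%b", gt1, gt2);   // gt1=0 gt2=1
endmodule: mixed_compare_demo
```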
Example 5-8 illustrates a small RTL model that uses the less-than, greater-than and
equality comparison operators. Figure 5-8 shows how this model might synthesize.
Figure 5-8: Synthesis result for Example 5-8: Relational operators (comparator)
The schematic shown in Figure 5-8 is based on generic components, before the synthesis
compiler has mapped the functionality to a specific target ASIC or FPGA
device. The synthesis compiler used to generate this generic schematic used a generic
greater-than comparator twice, but the top instance has the a and b inputs reversed.
The manner in which this generic functionality is mapped to actual components will
depend on the types of components available in a specific target technology.
The === and !== case equality operators are similar in usage to the == and != comparison
equality operators, but with an important functional difference. The case
equality operators perform their operation by comparing each bit of the two operands
for all 4 possible logic values, 0, 1, Z and X, whereas the comparison equality operators
only compare for values of 0 and 1 in each bit of the operands.
NOTE
Some RTL synthesis compilers do not support the === and !== case equality
operators at all. Other RTL synthesizers support these operators, but restrict
the usage to expressions that do not involve X or Z values.
The equality operators are supported by all RTL synthesis compilers. The case
equality operators are not universally supported. The case equality operators should
only be used in testbench code that is not intended to be synthesized.
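The difference shows up in a testbench-style sketch (not intended for synthesis) that compares two identical 4-state values. The operand values are assumptions for illustration.

```systemverilog
// Illustrative testbench sketch only: values are assumed.
module case_equality_demo;
  logic [3:0] a, b;

  initial begin
    a = 4'b01zx;
    b = 4'b01zx;
    // == cannot decide because of the Z and X bits; === matches them exactly
    $display("%b %b", a == b, a === b);     // x 1

    b = 4'b0101;
    a = 4'b0101;
    // With fully known values both operators agree
    $display("%b %b", a == b, a === b);     // 1 1
  end
endmodule: case_equality_demo
```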
The ==? and !=? wildcard case equality operators are synthesizable. These operators
compare the bits of two values, with the ability to mask out specific bits from the
comparison. Bits are masked out by specifying an X, Z or ? for the masked bits in the
second operand. The mask acts like a wildcard because the corresponding bit in the
first operand can be any value since it is masked from the comparison. In Example 5-9,
the comparison is only made on the upper 8 bits of a 32-bit word. The lower 24 bits
are ignored, and could therefore be any value.
Example 5-9: Using case equality operators: a comparator for high address range
//
// Set high_addr flag if all bits of upper byte of address
// are set
//
module high_address_check
(input logic clk, // clock input
input logic rstN, // active-low async reset
input logic [31:0] address, // 32-bit input
output logic high_addr // set high-byte all ones
);
The mask bits in Example 5-9 could have been represented as 32'hFFxxxxxx,
32'hFFzzzzzz, 32'hFF??????, or any combination of X, Z or ?, and using either
lowercase or uppercase characters for the X and Z. While all these variations are
functionally identical, the use of a question mark as a wildcard can make the code
more understandable and self-documenting. The letters X and Z can be wildcards in
some contexts, and literal values in other contexts. The dual usage of these letters can
make code less intuitive when used as a wildcard instead of as a logic value.
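A minimal sketch of such a comparator, using the question mark form of the mask, might look as follows. The port list follows Example 5-9, but the body is an assumption based on the comments in that example, and the module is renamed high_address_check_sketch to mark it as illustrative.

```systemverilog
// Illustrative sketch only: the always_ff body is assumed, not copied
// from Example 5-9.
module high_address_check_sketch
(input  logic        clk,        // clock input
 input  logic        rstN,       // active-low async reset
 input  logic [31:0] address,    // 32-bit input
 output logic        high_addr   // set high-byte all ones
);
  always_ff @(posedge clk or negedge rstN)
    if (!rstN) high_addr <= '0;
    // Only the upper byte is compared; the lower 24 bits are masked out
    else       high_addr <= (address ==? 32'hFF??????);
endmodule: high_address_check_sketch
```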
The ==? and !=? wildcard case equality operators are treated by synthesis compilers
in the same manner as the == and != equality operators, but with the masked bits
left out of the comparator. Figure 5-9 shows how Example 5-9 might synthesize.
Figure 5-9: Synthesis result for Example 5-9: Case equality, ==? (comparator)
bit is a wildcard — the corresponding bit position in the first operand could be any
value, including 4-state values.
always_comb begin
pattern_flag = data inside {8'b??1010??};
end // true if the middle bits of data match 1010
Example 5-10 illustrates a small RTL model that uses the inside operator.
Example 5-10: Using the set membership operator: a decoder for specific addresses
//
// Decoder that sets a flag whenever address is on a quadrant
// boundary of the address range
//
module boundary_detector
#(parameter N = 16)
(input  logic [N-1:0] address,        // address bus
 output logic         boundary_flag   // set when address is at
);                                    // a quadrant boundary
always_comb begin
boundary_flag = (address inside {(0), // quad 1
(((2**N)/4)*1), // quad 2
(((2**N)/4)*2), // quad 3
(((2**N)/4)*3) // quad 4
} );
end
endmodule: boundary_detector
The inside operator is versatile, and can represent a variety of gate-level comparison
circuits. Figure 5-10 shows how the model in Example 5-10 synthesized.
Figure 5-10: Synthesis result for Example 5-10: Inside operator (boundary detector)
Additional uses of the inside operator. The inside set membership operator
allows the members of the list to be expressions that can change during simulation.
Some other ways this operator can be used include:
1. The list of values can be expressions, such as other variables or nets.
always_comb begin
data_matches = data inside {a, b, c};
end // true if data matches the current value of a, b or c
NOTE
At the time this book was written, some RTL synthesis compilers did not
fully support the inside operator. Make sure that all tools in the design
flow support the ways the inside operator is being used in a project.
Shift operators shift the bits of a vector right or left a specified number of times.
SystemVerilog has both bitwise and arithmetic shift operators, listed in Table 5-18.
A bitwise shift simply moves the bits of a vector right or left the specified number
of times. The bits that are shifted out of the vector are lost. The new bits that are
shifted in are zero filled. For example, the operation 8'b11000101 << 2 will result
in the value 8'b00010100. A bitwise shift will perform the same operation, regardless
of whether the value being shifted is signed or unsigned.
An arithmetic left shift performs the same operation as a bitwise left shift on both
signed and unsigned expressions. An arithmetic right shift performs a different operation
on unsigned and signed expressions. If the expression being shifted is unsigned,
an arithmetic right shift behaves the same as a bitwise right shift, which is to fill the
incoming bits with zero. If the expression is signed, an arithmetic right shift will
maintain the signedness of the value by filling each incoming bit with the value of the
sign bit.
Figure 5-11 shows how these shift operations move the bits of a vector by 2 bits.
Figure 5-11: Bitwise and arithmetic shift operations (shifting an 8-bit vector by 2 bits)
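The same shifts can be reproduced in simulation. The sketch below applies each operator to the value 8'b11000101 with a shift of 2. The variable names are assumptions.

```systemverilog
// Illustrative sketch only: variable names are assumed.
module shift_demo;
  logic        [7:0] u;     // unsigned vector
  logic signed [7:0] s;     // signed vector

  initial begin
    u = 8'b11000101;
    s = 8'b11000101;
    $display("%b", u <<  2);   // 00010100  bitwise left shift
    $display("%b", u >>  2);   // 00110001  bitwise right shift
    $display("%b", u <<< 2);   // 00010100  arithmetic left shift
    $display("%b", u >>> 2);   // 00110001  arithmetic right, unsigned: zero fill
    $display("%b", s >>> 2);   // 11110001  arithmetic right, signed: sign fill
  end
endmodule: shift_demo
```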
Shifting a fixed number of times. A shift operation for a fixed number of times
simply rewires the bits of a bus, with the incoming bits tied to ground. No logic gates
are required to implement a fixed shift. Example 5-11 illustrates a simple divide-by-two
combinational logic model, where the division is performed by shifting an 8-bit
bus right by one bit.
Example 5-11: Using the shift operator: divide-by-two by shifting right one bit
//
// Divide-by-two operation by shifting an N-bit bus right by
// one bit. Fractional results are rounded down.
//
module divide_by_two
#(parameter N = 8)
(input  logic [N-1:0] data_in,    // N-bit input
 output logic [N-1:0] data_out    // N-bit output
);
Figure 5-12 shows how this shift right for a fixed number of bits might synthesize.
The synthesis compiler placed buffers on the inputs and outputs of the module, but
did not utilize any additional gates to perform the operation.
Figure 5-12: Synthesis result for Example 5-11: Shift operator, right-shift by 1 bit
A shift for a fixed number of times can also be represented using a concatenate
operation. The following two lines of code are functionally identical.
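As a sketch of this equivalence for a right shift by one bit (the signal names y1 and y2 are assumptions):

```systemverilog
// Illustrative sketch only: y1 and y2 are assumed names.
module fixed_shift_demo;
  logic [7:0] data_in, y1, y2;

  assign y1 = data_in >> 1;             // shift operator
  assign y2 = {1'b0, data_in[7:1]};     // concatenate: zero in, LSB dropped

  initial begin
    data_in = 8'b11000101;
    #1 $display("%b %b", y1, y2);       // 01100010 01100010
  end
endmodule: fixed_shift_demo
```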
Example 5-12: Using the shift operator: multiply by a power of two by shifting left
//
// Multiply by a power of two operation by shifting an N-bit
// bus left by a variable number of times; no overflow.
//
module multiply_by_power_of_two
#(parameter N = 8)
(input  logic [N-1:0]         data_in,    // N-bit input
 input  logic [$clog2(N)-1:0] base2exp,   // ceiling log2 of N
 output logic [N-1:0]         data_out    // N-bit output
);
endmodule: multiply_by_power_of_two
The $clog2 system function in this example is used to calculate the width of the
base2exp input port. This function returns the ceiling (fractional values are rounded
up to the next whole number) of the log2 of a value. The function is a convenient way
to calculate how many bits are required to represent a value.
Figure 5-13 illustrates how this model might synthesize. The schematic is the intermediate
synthesis result, before the shift functionality has been mapped and optimized
to a specific device. A generic “shift left logical” component represents the
unmapped shift operation.
Figure 5-13: Synthesis result for Example 5-12: Shift operator, variable left shifts
The generic shift-left component in the synthesis results has the same number of
bits for both of its inputs. The unused upper bits for the base2exp input are tied to
ground. These unused bits might be removed when synthesis maps the generic shift-left
component to a specific target implementation.
The shift operator can be used to multiply or divide by values other than a power of
2. The following example shifts a vector 7 times.
Let synthesis do its job! Synthesis allows engineers to design at an abstract level,
focusing on functionality without getting bogged down in implementation details, and
without having to be overly concerned about the features of a specific ASIC or
FPGA. The synthesis compiler translates the abstract functional model to an efficient
implementation for a target ASIC or FPGA. While it is possible to model barrel shift
behavior at a more detailed level, there is generally no advantage in doing so. Modern
synthesis compilers recognize barrel-shift behavior in an abstract RTL model using
the shift operator, and will produce an optimal implementation of this functionality in
the target device. This implementation might vary for different target devices,
depending on what standard cells, LUTs, or gate-arrays are available in that device.
The rotation works by first concatenating the expression to be rotated to itself, and
then shifting the result the desired number of times. The shift operation causes the bits
of the concatenated vector to be shifted over, with the effect of rotating the bits of one
end of the original value shifting into the bit positions at the other end of the original
value. After the shift operation, the half of the concatenated vector containing the
rotation result is selected. For a shift-right operation, the right half contains the
desired result. For a shift-left operation, the left half contains the desired result.
Figure 5-14 illustrates how the concatenate-then-shift operation moves the bits of a
value. The illustration rotates a value two times, but the operation will work for rotating
a value up to N times, where N is the number of bits in the original value. (Rotating
more than N times causes zeros to shift into the original bit positions, and no
longer behaves like a rotate operation.)
Figure 5-14: Rotate a variable number of times using concatenate and shift operators
Rotating a value left or right by 2 bits
For the value: m = 8'b11000101
concatenate-then-shift right 2 times: temp = {m,m} >> 2
concatenate-then-shift left 2 times:  temp = {m,m} << 2
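The two rotations in the figure can be worked through in a short simulation sketch. The variable names rot_right and rot_left are assumptions.

```systemverilog
// Illustrative sketch only: rot_right and rot_left are assumed names.
module rotate_demo;
  logic [7:0]  m, rot_right, rot_left;
  logic [15:0] temp;

  initial begin
    m = 8'b11000101;

    temp      = {m, m} >> 2;    // 0011000101110001
    rot_right = temp[7:0];      // right half: 01110001

    temp      = {m, m} << 2;    // 0001011100010100
    rot_left  = temp[15:8];     // left half:  00010111

    $display("%b %b", rot_right, rot_left);   // 01110001 00010111
  end
endmodule: rotate_demo
```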
Example 5-13: Performing a rotate operation using concatenate and shift operators
//
// Rotate an input vector left the number of times specified
// by a rotation factor input.
//
module rotate_left_rfactor_times
# (parameter N = 8)
(input  logic [N-1:0]         data_in,   // N-bit input
 input  logic [$clog2(N)-1:0] rfactor,   // ceiling log2 of N
 output logic [N-1:0]         data_out   // N-bit output
);
logic [N*2-1:0] temp;
Figure 5-15: Synthesis result for Example 5-13: Concatenate and shift (rotate)
Rotate right operation shortcut. A rotate right operation can be performed without
a temporary variable by taking advantage of SystemVerilog’s assignment statement
rules.
logic [ 7:0] in, out;
logic [ 2:0] rfactor;
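One way this shortcut might be written is sketched below. The assignment is an assumption based on the description above: the 16-bit result of the shift is assigned to an 8-bit variable, so the upper half is truncated and only the right half, which holds the rotate-right result, is kept.

```systemverilog
// Illustrative sketch only: the assign statement is assumed.
module rotate_right_shortcut;
  logic [7:0] in, out;
  logic [2:0] rfactor;

  // {in,in} is 16 bits wide; the assignment keeps only the lower 8 bits
  assign out = {in, in} >> rfactor;

  initial begin
    in      = 8'b11000101;
    rfactor = 3'd2;
    #1 $display("%b", out);     // 01110001
  end
endmodule: rotate_right_shortcut
```

Some tools report a size-mismatch warning for the implied truncation. Casting the expression to the width of out is one way to make the truncation explicit.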
The cast operator is discussed in more detail in section 5.15 (page 198).
Synthesis compilers do not support loops that execute a variable number of
times. Synthesis requires that the number of times a loop will iterate be a static
value that is available during compilation. Synthesizing static and data-dependent
loop iterations is discussed in Chapter 6, section 6.3.1.1 (page 230).
Table 5-19: Streaming operators (pack and unpack) for RTL modeling
assign a = 8'b11000101;
always_comb begin
b = { << {a}}; // sets b to 8'b10100011 (bit reverse of a)
end
assign in = 32'hAABBCCDD;
always_comb begin
{>>8{a}} = in; // sets a[0]=AA, a[1]=BB, a[2]=CC, a[3]=DD
end
NOTE
At the time this book was written, some synthesis compilers did not support
the streaming operators, and some synthesis compilers only supported
streaming when the default block size of 1-bit was used, as in the bit reversal
code snippet example above.
Before using the streaming operators in RTL models, engineers should make
sure the operators are supported by all software tools used in a design flow.
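As a hedged illustration of streaming with a block size other than 1 bit, the sketch below reverses the byte order of a 32-bit word. The module and port names are assumptions, and, as the note above cautions, a given synthesis compiler might not accept a block size of 8.

```systemverilog
// Illustrative sketch: the module and port names are assumptions.
module swap_byte_order
(input  logic [31:0] data_in,
 output logic [31:0] data_out
);
  // Stream data_in right-to-left in 8-bit blocks. The byte order is
  // reversed, and the bit order within each byte is preserved.
  assign data_out = { << 8 {data_in}};
endmodule: swap_byte_order
```

With this sketch, an input of 32'hAABBCCDD produces an output of 32'hDDCCBBAA.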
Example 5-14: Using the streaming operator: reverse bits of a parameterized vector
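module reverse_bits
# (parameter N = 8)
(input  logic [N-1:0] data_in,  // N-bit input
 output logic [N-1:0] data_out  // N-bit output
);
  // The remainder of this listing was lost in extraction; the lines
  // below are reconstructed from the bit reversal snippet shown earlier.
  assign data_out = { << {data_in}};  // reverse the bit order
endmodule: reverse_bits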
Figure 5-16: Synthesis result for Example 5-14: Streaming operator (bit reversal)
The synthesis compiler shows buffers to map each bit number of data_in to a
different bit in data_out. These buffers will probably be removed when the
technology-independent netlist is mapped to a specific ASIC or FPGA target device.
All SystemVerilog arithmetic operators are synthesizable, but specific ASICs and
FPGAs might have restrictions on what can be implemented at the gate level in that
device. Operations such as multiply, divide, modulus, and power are complex circuits
in hardware, and can require a substantial number of logic gates and long
propagation timing paths.
When writing RTL models for synthesis, it is important for the design engineer to
remember that the final purpose of RTL code is not to be a software program that will
run on a general purpose computer. The objective of RTL models is to be an abstract
representation of digital logic gates. The simple code:
always_ff @(posedge clk)
  out <= a / b;
is asking the synthesis compiler to create a gate-level divider that reaches a
completed result every clock cycle, with no intermediate pipelined stages. Whether
this will be possible depends on a number of factors, such as the design’s clock
speed, the widths of the numerator and denominator vectors, and the capabilities of
the target ASIC or FPGA device.
NOTE
The capabilities and limitations of each specific ASIC or FPGA device can
vary widely. RTL models that use the multiply, divide, modulus and power
operators should be written to match the capabilities of the target device.
(c) For multiplication and division when both operands are non-constant
values, use smaller vector sizes, such as 8 bits.
Adhering to these guidelines will help to ensure that RTL models can be
synthesized to most target ASIC and FPGA devices.
Design engineers need to take extra steps when writing RTL models where the
design specification requires operations that are outside of these suggested guidelines.
It might be necessary to model a pipelined data path to break an operation into multiple
clock cycles. Some synthesis compilers have the ability to do register retiming, an
automated process of moving combinational logic in a pipeline to different stages of
the pipeline in order to achieve faster clock speeds. Another design technique is to use
a Finite State Machine to break a complex arithmetic operation into multiple clock
cycles.
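As one possible sketch of the pipelining approach mentioned above, the following model registers the operands in one clock cycle and the product in the next. The module name, port names, and the choice of two stages are assumptions made for illustration; the number and placement of stages in a real design depend on the target device and clock speed.

```systemverilog
// Illustrative sketch: names and the two-stage split are assumptions.
module pipelined_multiplier
# (parameter N = 8)
(input  logic           clk,
 input  logic [N-1:0]   a, b,
 output logic [2*N-1:0] product  // valid 2 clock cycles after a, b
);
  logic [N-1:0] a_q, b_q;  // stage 1: registered operands

  always_ff @(posedge clk) begin
    a_q     <= a;          // stage 1
    b_q     <= b;
    product <= a_q * b_q;  // stage 2: registered result
  end
endmodule: pipelined_multiplier
```

A synthesis compiler that supports register retiming could, in principle, move part of the multiplier logic across these registers to balance the stages.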
Many target ASIC and FPGA devices have predefined gate-level arithmetic blocks
or Intellectual Property (IP) models for complex arithmetic operations. These
components can be used in place of a SystemVerilog arithmetic operator. The use of
built-in gate-level arithmetic blocks or IP models can be very effective for
achieving best Quality of Results (QoR) in an implementation. The trade-off is that
the design models can become locked to a specific target ASIC or FPGA family.
Rewriting, and reverifying, some of the RTL models might be necessary to change to a
different target device.
High-level Synthesis (HLS) tools can also be used to map abstract complex
operations into either RTL models or directly into logic gates. Where Register
Transfer Level (RTL) modeling requires that the design engineer specify exactly what
operations need to be done in each clock cycle, High-level Synthesis allows
specifying that an operation has to be completed within a specific number of clock
cycles. The synthesis compiler then determines how to best implement that
requirement. High-level Synthesis is outside the scope of this book, which is on best
coding practices for writing RTL models.
endmodule: unsigned_adder
endmodule: signed_adder
Figure 5-17 shows the synthesis result for the unsigned adder in Example 5-15,
and Figure 5-18 shows the synthesis result for the signed adder in Example 5-16.
Figure 5-17: Synthesis result for Example 5-15: Arithmetic operation, unsigned
Figure 5-18: Synthesis result for Example 5-16: Arithmetic operation, signed
Observe that the Synthesis result for the unsigned adder shown in Figure 5-17 is
identical to the Synthesis result for the signed adder shown in Figure 5-18. The reason
for this is discussed in section 5.12.2 (page 188) of this chapter.
The synthesis compiler used to generate the implementations shown in Figures
5-17 and 5-18 mapped the RTL adder functionality to a generic adder block that has
unused carry-in and carry-out bits. The next step in synthesis would be to target a
specific ASIC or FPGA device. The generic adder would be mapped to a specific adder
implementation during that step. The device-specific adder might not have these
carry-in and carry-out ports, depending on the components available in that specific
device.
NOTE
Example 5-18 is not synthesizable, and is shown here to illustrate how data
types can affect operations. RTL synthesis compilers typically do not
support real (floating-point) expressions. High-level Synthesis (HLS) tools can
be used for complex arithmetic design. Floating point and fixed point design
is outside the scope of this book.
5.12.2 Unsigned and signed arithmetic might synthesize to the same gates
In simulation, an unsigned adder treats a negative input value as a large positive
value. This is because negative values are represented in two’s-complement form. The
most significant bit (bit 7 for the 8-bit vector examples below) becomes the sign bit,
which, when set, indicates that the value is negative. When a negative value with the
sign bit set is treated as an unsigned value, the sign bit loses its meaning. That value
with its most significant bit set becomes a large positive value.
For the unsigned adder modeled in Example 5-15, the following input values,
shown in decimal and binary, produce unsigned values for the outputs:
a=1  (00000001), b=1   (00000001): sum = 2   (00000010)
a=1  (00000001), b=255 (11111111): sum = 0   (00000000)
a=1  (00000001), b=-3  (11111101): sum = 254 (11111110)
a=-1 (11111111), b=-3  (11111101): sum = 252 (11111100)
When the same input values are applied to the signed adder modeled in
Example 5-16, the decimal results are different.
a=1  (00000001), b=1   (00000001): sum = 2   (00000010)
a=1  (00000001), b=-1  (11111111): sum = 0   (00000000)
a=1  (00000001), b=-3  (11111101): sum = -2  (11111110)
a=-1 (11111111), b=-3  (11111101): sum = -4  (11111100)
Observe that, in decimal, the results of the signed and unsigned adders are different.
This is because the decimal radix interprets the most significant bit of sum as a sign
bit. In binary, however, the output values for the unsigned adder and the signed adder
are identical. The difference between signed and unsigned operations is not the binary
result, it is how the most-significant bit of that result is interpreted. With unsigned
types, the most-significant bit is simply part of the value. With signed types, the most-
significant bit is a flag, indicating the value is negative.
This similarity in how unsigned and signed types synthesize is true for add,
subtract, and multiply operations, but is not true for divide operations. The binary
result for divide operations can be different for signed and unsigned operations
because divide operations can have fractional results. For example, a signed divide
operation of 1 / -1 will result in -1, whereas an unsigned divide operation will
result in 0. The reason is that -1 as an unsigned value is 255, so the unsigned
operation is actually 1 / 255, which is a fractional result that cannot be
represented as an integer.
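The difference between addition and division can be checked in simulation with a small model such as the following sketch. The module and signal names are assumptions, and the model is intended for simulation only, since it uses the divide operator on non-constant operands.

```systemverilog
// Illustrative sketch: the module and signal names are assumptions.
module add_vs_divide_signedness
(input  logic        [7:0] ua, ub,        // unsigned operands
 input  logic signed [7:0] sa, sb,        // signed operands
 output logic        [7:0] u_sum, u_quo,  // unsigned results
 output logic signed [7:0] s_sum, s_quo   // signed results
);
  assign u_sum = ua + ub;  // same bits as s_sum for the same input bits
  assign s_sum = sa + sb;
  assign u_quo = ua / ub;  // bits can differ from s_quo
  assign s_quo = sa / sb;
endmodule: add_vs_divide_signedness
```

Driving both pairs of operands with the bit patterns 00000001 and 11111111 gives 00000000 for both sums, but 00000000 for the unsigned quotient and 11111111 (-1) for the signed quotient.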
Declaring module ports and internal variables as the logic type will infer an
unsigned net type for input and inout ports, and an unsigned variable for output ports.
--n   n--     pre-decrement the value of n by 1, or post-decrement n by 1
The operand for the ++ and -- operators must be a vector variable with a size of 1
or more bits. The floating-point types of real or shortreal can also be used, but
these types are not supported by most synthesis compilers.
In a pre-increment operation, the value of the operand is first incremented by 1, and
the new value is returned from the operation. For example, in the statement:
n = 5;
y = ++n; // y=6, n=6
The current value of n is first incremented, and the result is assigned to y. Thus,
after the statement is executed, y has the value of 6 and n has the value of 6.
In a post-increment operation, the current value of the operand is first returned, then
the operand is incremented by 1. In the statement:
n = 5;
y = n++; // y=5, n=6
The current value of n is assigned to y, and then n is incremented. Thus, after the
statement is executed, y has the value of 5 and n has the value of 6.
These same rules apply to the -- decrement operator, except that the operand is
decremented by 1.
n = 5;
y = --n;  // y=4, n=4
y = n--;  // y=4, n=3
NOTE
The following code snippet illustrates a proper usage of the increment and
decrement operators in a combinational logic model:
logic [15:0] data_bus;
logic [ 4:0] count_ones;  // 5 bits, so that a count of 16 does not overflow
always_comb begin
count_ones = '0;
for (int i=15; i>=0; i--)
if (data_bus[i]) count_ones++;
end
This next code fragment shows an improper usage of an increment operator:
parameter MAX = 12;
logic [7:0] count, data, q;
The proper place for using blocking assignment behavior is when representing
combinational logic. Using the increment and decrement operators to model
sequential logic, such as counters, will cause simulation race conditions.
Nonblocking assignments are required in order to avoid simulation race conditions
in sequential logic procedures.
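For contrast, a counter can be modeled with a nonblocking assignment and the + operator instead of the ++ operator. The following is a minimal sketch; the module name, port names, and the active-low asynchronous reset are assumptions chosen for illustration.

```systemverilog
// Illustrative sketch: names and reset style are assumptions.
module up_counter
# (parameter N = 8)
(input  logic         clk,
 input  logic         rstN,   // active-low asynchronous reset
 output logic [N-1:0] count
);
  always_ff @(posedge clk or negedge rstN)
    if (!rstN) count <= '0;
    else       count <= count + 1'b1;  // nonblocking, instead of count++
endmodule: up_counter
```

Because count is updated with a nonblocking assignment, any other sequential block that reads count on the same clock edge sees the previous value, which matches the behavior of flip-flops.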
The most common usage of the increment and decrement operators is with for
loop control variables, as in the following example.
5.13.2 An example of correct usage of increment and decrement operators
Example 5-18 below is similar to the count_ones code snippet shown earlier in
this section. This more complete example uses a parameter for the data_bus size so
that the model can be scaled to different bus widths. An 8-bit bus size is used in order
to keep the synthesized schematic size small for the page size of this book.
Figure 5-19 shows the resulting synthesis schematic for this example.
  always_comb begin
    count = '0;
    for (int i=N-1; i>=0; i--)
      if (data_bus[i]) count++;
  end
endmodule: count_ones
Figure 5-19: Synthesis result for Example 5-18: Increment and decrement operators
The synthesis compiler used to generate the implementation shown in Figure 5-19
mapped the RTL bit counter functionality to a series of generic adders. The adders
represent the ++ operator that increments the count variable.
The decrement operator that is part of the for loop does not appear in the
synthesized results. This is because the for loop in the RTL model is unrolled, to
create adders for each pass of the loop. The generic adder has a carry-in input, and
therefore can add up to 3 bits of the data_bus. For an 8-bit data_bus, 3 of these
generic adders are instantiated from the for loop, and a 4th adder is used to sum the
results of these 3 adders.
The next step in synthesis is to target a specific ASIC or FPGA device. The generic
adders will be mapped to a specific implementation. The mapping process might
perform further optimizations based on the adder types available in the target device.
Figure 5-20 shows the results of synthesis targeting the generic increment adders to
a Xilinx Virtex®-7 FPGA. The adders were replaced by functionality programmed
into the device’s LUTs (Look Up Tables). Each LUT contains a number of basic logic
gates. The gates, and connections between them, can be programmed to implement
specific functionality, such as the series of increment operations.
Figure 5-21 shows the results of synthesis targeting the generic increment adders to
a Xilinx CoolRunner™-II CPLD. In this device, the increment functionality was
mapped to discrete AND, OR and inverter gates. Again, only a portion of the
schematic is shown, in order to focus on how the incrementers were implemented.
SystemVerilog has operator precedence rules to define the order in which multiple
operations are performed. These rules are discussed in more detail in section 5.16.
NOTE
The increment/decrement operator has the same precedence as several other
arithmetic operators. The order in which operations can be evaluated is
ambiguous in a compound expression that uses increment/decrement in
combination with other arithmetic operators.
When the increment or decrement operator is used in conjunction with other
arithmetic operators that have the same evaluation precedence, the simulator can
evaluate the operators in any order. For example:
n = 5;
y = n + ++n; // y could be assigned 11 or 12
In this code snippet, a simulator could either:
• Evaluate the + operator first, and then the ++ operator. In this case, simulation will
use the current value of n, which is 5, plus the return of the pre-increment ++
operation, which is 6. The result of the compound operations is 11 (5 + 6).
• Evaluate the ++ operator first, and then the + operator. In this case, simulation will
use the new value of n, which is 6, plus the return of the pre-increment ++
operation, which is 6. The result of the compound operations is 12 (6 + 6).
This ambiguity of evaluation order can lead to different results in different
simulators, or, even more dangerous, a difference in the verified RTL simulation and
the gate-level implementation from synthesis.
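One way to avoid the ambiguity is to perform the increment in its own statement, so that the order of evaluation is fixed by statement order rather than by the tool. The sketch below assumes the intended result was 12 (the incremented value used on both sides of the + operator); the module and signal names are hypothetical.

```systemverilog
// Illustrative sketch: the module and signal names are assumptions.
module unambiguous_increment
(input  logic [7:0] n_in,
 output logic [7:0] n_out,
 output logic [7:0] y
);
  always_comb begin
    n_out = n_in + 1'b1;    // the increment, as its own statement
    y     = n_out + n_out;  // incremented value used for both operands
  end
endmodule: unambiguous_increment
```

If the intent had instead been to add the original value to the incremented value, the second statement would be written as y = n_in + n_out.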
NOTE
The assignment operators use blocking assignment behavior.
put of a sequential logic block can result in simulation race conditions, which can lead
to the RTL model behavior not matching the synthesized gate-level behavior.
Example 5-19 illustrates using assignment operations in a simple combinational
logic block. Figure 5-22 shows the synthesis output for this example.
module bitwise_unit
import bitwise_types_pkg::*;
# (parameter N = 8)
(input  logic [N-1:0] a, b,
 input  op_t          opcode,
 output logic [N-1:0] result
);
  always_comb begin
    result = a;     // transfer a input to result output
    case (opcode)   // modify result based on opcode
      AND_OP: result &= b;
      OR_OP : result |= b;
      XOR_OP: result ^= b;
make code more understandable. The author feels that these assignment operators do
not meet any of these objectives when used in a synthesizable RTL context.
Observe that in the RTL model shown in Example 5-19, an intermediate
assignment to result had to be made before the case statement, in order to use
result as the first operand of the assignment operators in the case statement. This
extra line of code is not needed when using regular assignment statements and
operators. The following code snippet shows how the code in Example 5-19 can be
modeled more concisely, and in a way that is easier to read, when the assignment
operators are not used.
always_comb begin
  case (opcode)
    AND_OP: result = a & b;
    OR_OP : result = a | b;
    XOR_OP: result = a ^ b;
    RS1_OP: result = a >> 1;
  endcase
end
SystemVerilog provides a cast operator that allows explicitly changing the type,
size or signedness of an expression. The three forms of the cast operator are listed in
Table 5-23.
For those familiar with the C language, it should be noted that the syntax for type
casting is different than C. SystemVerilog uses the format <type>'(<expression>),
whereas C uses the format (<type>)<expression>. The different syntax is necessary
to maintain backward compatibility with how the original Verilog language uses
parentheses, and to provide the additional casting capabilities of size and sign casting
that are not in C.
SystemVerilog is a loosely typed language, meaning an implicit conversion
automatically happens when an expression of one type or size is assigned to an
expression of a different type or size. Some simple examples of these loosely typed
conversions are:
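initial begin
  u32 = u16;  // 16-bit value is left-extended to 32 bits
  u16 = u32;  // upper 16 bits of the 32-bit value are truncated
  s32 = u32;  // unsigned value is treated as a signed value
  u32 = s32;  // signed value is treated as an unsigned value
  r64 = s32;  // integer value is converted to a real value
  s32 = r64;  // real value is rounded to an integer value
end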
An implicit type or size conversion can also occur as context-dependent operations
are evaluated, as discussed in section 5.1.3 (page 144). In the statement:
assign u32 = u32 + u16;  // 32-bit add operation
Arithmetic operators, such as +, require that both operands be the same type and
size. All operands will be expanded to the largest vector size before the operation is
performed. Therefore, in the operation u32 + u16, u16 will first be converted to a
32-bit size by left-extending its value.
Another implicit conversion that can automatically occur is from a signed value to
an unsigned value, or vice versa. In the statement:
assign s32 = s32 < u32; // 32-bit unsigned comparison
The signed s32 value will be implicitly converted to an unsigned value.
Context-dependent operations (see section 5.1.3, page 144) only perform a signed
operation if both operands are signed. If one of the operands is unsigned, an unsigned
operation is performed. In the less-than comparison of s32 < u32, s32 is first
converted to an unsigned value because u32 is unsigned.
These conversion rules are defined in the IEEE 1800 SystemVerilog standard, so all
software tools that use SystemVerilog, including simulators and synthesis compilers,
perform the same conversions. The conversion rules that most often occur in RTL
modeling are discussed in section 5.1.3 (page 144) of this chapter. Refer to the IEEE
standard for a full description of all possible conversions that can occur when an
expression of one type or size is assigned to an expression of another data type or
size.
Observe that data can be lost in some of these loosely typed conversions. In the
assignment u16 = u32;, the left-most 16 bits are truncated. The value that was in
those upper two bytes of u32 is lost. In the assignment s32 = r64;, the
double-precision floating-point value is rounded to an integer. Any fractional
precision is lost, and any value greater than what a 32-bit integer can store is lost.
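The data loss described above can be observed with a short simulation-only sketch such as the following. The variable declarations and test values are assumptions chosen for illustration, since the declarations for u32, u16, s32 and r64 are not shown in this section.

```systemverilog
// Illustrative sketch: declarations and test values are assumptions.
module implicit_conversion_demo;
  logic        [31:0] u32;
  logic        [15:0] u16;
  logic signed [31:0] s32;
  real                r64;

  initial begin
    u32 = 32'h1234_ABCD;
    u16 = u32;   // upper 16 bits (1234) are truncated; u16 is 16'hABCD
    r64 = 3.7;
    s32 = r64;   // real value is rounded to the nearest integer; s32 is 4
    $display("u16 = %h, s32 = %0d", u16, s32);
  end
endmodule: implicit_conversion_demo
```

Neither assignment is an error in SystemVerilog, which is why the lost bits and lost fractional value can go unnoticed without a lint checker.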
SystemVerilog will perform implicit type casting when: a) an operation has a mix
of operand types, or b) an expression of one type is assigned to an expression of
another type. For the most part, this implicit casting will do the right thing, and will
synthesize to the desired gate-level functionality. One purpose for the use of type
casting in RTL modeling is to either make the implicit type conversion more obvious
by doing an explicit type cast, or to do something different than the implicit
conversion rules. A second purpose for using type casting in RTL models is when
assigning values to enumerated variables, which do not have implicit conversion
rules the way other SystemVerilog data types have.
instruction_t instruction;
opcode_t opcode;
logic [15:0] data;
always_comb begin
case (instruction)
ARITHMETIC : opcode = data[2:0]; // illegal assignment
BRANCH : opcode = NOP;
endcase
end
A type cast can be used to explicitly convert the 3-bit value from data to the
enumerated type, making this assignment legal.
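A minimal sketch of such a type cast follows. The enumerated type definitions are not shown in this section, so the package name, the base types, and every label other than ARITHMETIC, BRANCH and NOP are assumptions made for illustration.

```systemverilog
// Illustrative sketch: the package name, base types, and the labels
// ADD, SUB, MULT and DIV are assumptions.
package cast_demo_pkg;
  typedef enum logic [1:0] {ARITHMETIC, BRANCH}         instruction_t;
  typedef enum logic [2:0] {NOP, ADD, SUB, MULT, DIV}   opcode_t;
endpackage: cast_demo_pkg

module opcode_decode
import cast_demo_pkg::*;
(input  instruction_t instruction,
 input  logic [15:0]  data,
 output opcode_t      opcode
);
  always_comb begin
    case (instruction)
      ARITHMETIC : opcode = opcode_t'(data[2:0]);  // legal with the cast
      BRANCH     : opcode = NOP;
      default    : opcode = NOP;
    endcase
  end
endmodule: opcode_decode
```

The static cast does not check that the value is one of the labels of the enumerated type, so in this sketch a data[2:0] value of 5, 6 or 7 would give opcode a value that has no label.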
Mixed integer and floating-point operations. The following code snippet shows a
compound operation with a mix of integer and floating-point types.
parameter PI = 3.14159;
logic [31:0] a, b, result;
The implicit conversion of an expression of one vector size to another vector size is
widely used in RTL modeling. Perhaps one of the most common places an implicit
size conversion occurs is with the literal values of 0 and 1. For example:
logic [7:0] count;
The variable in is 8 bits wide. The result of the concatenation {in,in} is a 16-bit
value, which is being assigned to the 8-bit variable out. The upper 8 bits of the
concatenation and shift result are implicitly truncated during the assignment, and so
only the lower 8 bits are transferred to out. This is the desired effect of rotating
right a variable number of times. The code takes advantage of the implicit size
conversion defined in the SystemVerilog language.
Functionally incorrect implicit size casting. The following code snippet, a variable
rotate-left operation, illustrates a design error resulting from an assignment mismatch
and the implicit size truncation that occurs.
Warning Rhs width '16' with shift (Expr: '({in,in} << rfactor)')
is more than lhs width '8' (Expr: 'out'), this may
cause overflow
*Partial output report generated by Synopsys SpyGlass Lint® RTL style checker.
For a rotate-left operation, the implicit truncation is a design bug, and this lint
checker warning helps the designer recognize that there is a problem in the code. A
correct, synthesizable way to model this variable rotate-left operation is to use an
intermediate 16-bit variable to store the concatenate and shift result, as shown
earlier in this chapter in Example 5-13 (page 179).
Implicit size conversion warning messages. The IEEE 1800 SystemVerilog
standard does not require assignment size mismatch warnings. Most SystemVerilog
simulators and synthesis compilers do not generate these warnings, assuming (and
trusting) that the design engineer deliberately intended to have a mismatch in the
assignment sizes.
On the other hand, lint checkers (tools that verify that code adheres to RTL
modeling guidelines) will generate warnings when the expression sizes on the
left-hand and right-hand side of an assignment do not match. These truncation size
mismatch warnings can be useful if there is an error in the code and the designer’s
intent is to have the same vector size on both sides of an assignment.
The size mismatch in the left-rotate example above is a design mistake. The
mismatch warning message generated by a lint checker (but probably not by simulators
or synthesis compilers) is a desirable warning, that can find and prevent design bugs.
The size mismatch warnings for the rotate and counter examples shown earlier in
this section, however, are false warnings. The implicit size conversion is
functionally correct in both simulation and synthesis. A lint warning is neither
warranted nor wanted. Engineering time can be lost analyzing a false warning to
determine that the truncation is OK in this circumstance. These false warnings need
to be ignored, which adds a risk of then accidentally ignoring other size mismatch
warnings that might have indicated a design error. In a larger design, there can be
hundreds of false warnings regarding implicit size and type conversions. These false
warnings can hide a warning message for an incorrect or undesired size or type
mismatch. This is a serious problem!
Using size casting to prevent false size mismatch warnings. The cast operator can
help to make code more self-documenting and intuitive, as well as eliminating false
warning messages. The variable rotate-right operation can be coded to explicitly show
that the result of the concatenate/shift operation is to be 8 bits wide.
assign out = 8'({in,in} >> rfactor); //variable rotate right
Size casting follows the same rules as assignment statements. If an expression is
cast to a smaller size than the number of bits in the expression, the left-most bits of
the expression are truncated. If the expression is cast to a larger vector size, then the
expression is left-extended. An unsigned expression is left-extended with 0; a signed
expression is left-extended using sign extension. (The example above is an unsigned
expression, because the result of a concatenation is always unsigned.)
The size specified with the cast operator can be a run-time constant, which allows
for parameterized modules to scale appropriately when parameter values are
redefined. For example:
parameter N = 8;
logic [N-1:0]       in;       // N-bit vector
logic [$clog2(N):0] rfactor;  // calculate max rotate size
logic [N-1:0]       out;      // N-bit vector
Example 5-20 shows the full code for a variable rotate-right operation that uses
size casting to eliminate false lint warnings. Figure 5-23 shows the results of
synthesizing this example.
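The listing for Example 5-20 is not reproduced in this extract. As a rough sketch of what a parameterized rotate right with a size cast could look like, based on the declarations above, consider the following; the module name is an assumption, and this is not necessarily the author's exact code.

```systemverilog
// Illustrative sketch: the module name is an assumption.
module rotate_right_size_cast
# (parameter N = 8)
(input  logic [N-1:0]       in,       // N-bit input
 input  logic [$clog2(N):0] rfactor,  // rotate right 0 to N times
 output logic [N-1:0]       out       // N-bit output
);
  // N'() explicitly keeps only the lower N bits of the 2N-bit result
  assign out = N'({in,in} >> rfactor);
endmodule: rotate_right_size_cast
```

In this sketch, rfactor values larger than N shift zeros into the lower N bits, so they do not produce a true rotation.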
Example 5-20 will simulate and synthesize correctly, with or without the size
casting. The purposes of the size casting are to: (1) make the code more
self-documenting that only N bits of the concatenate result are being used, and (2)
perhaps more importantly, to eliminate problematic false warnings from RTL lint
checkers.
Mixed operand signedness. The implicit signedness conversion rule for when the
operands in a context-dependent operation have mixed signedness is that the signed
expression will be converted to an unsigned value.
The following code snippet shows a less-than relational operation with a signed
operand and an unsigned operand.
logic        [7:0] u1;  // 8-bit unsigned variable
logic signed [7:0] s1;  // 8-bit signed variable
initial begin
  s1 = -5;
  u1 = 1;
  if (s1 < u1)
    $display("%0d is less than %0d", s1, u1);
  else
    $display("%0d is equal or greater than %0d", s1, u1);
end
When simulated, this code snippet will display the message:
-5 is equal or greater than 1
It might seem that -5 is less than 1, and yet this code evaluates -5 as being greater
than 1. This happens because SystemVerilog’s implicit type conversion will change
the s1 value to unsigned, so as to match the unsigned type of u1. The result of this
conversion is that the value of -5 becomes 251. (The two’s-complement of an 8-bit -5
is 11111011 in binary, which, when treated as an unsigned value, is 251 in decimal.)
Both operands of a context-determined operation must be signed in order for the
operation to be signed. Signedness casting provides a means to specify that a data
type conversion should occur at any point during the evaluation of an expression. The
following code snippet uses casting to explicitly convert u1 to a signed expression.
The operation will now correctly evaluate that -5 is less than 1.
if (s1 < signed'(u1))
  $display("%0d is less than %0d", s1, u1);
else
  $display("%0d is equal or greater than %0d", s1, u1);
Conversely, this example can be explicitly coded as an unsigned comparator by
casting s1 to an unsigned expression.
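Both forms of the comparison can be placed side by side, as in the following sketch. The module and output names are assumptions made for illustration.

```systemverilog
// Illustrative sketch: the module and output names are assumptions.
module cast_compare
(input  logic signed [7:0] s1,           // 8-bit signed operand
 input  logic        [7:0] u1,           // 8-bit unsigned operand
 output logic              lt_signed,    // result of signed compare
 output logic              lt_unsigned   // result of unsigned compare
);
  assign lt_signed   = (s1 < signed'(u1));    // both operands signed
  assign lt_unsigned = (unsigned'(s1) < u1);  // both operands unsigned
endmodule: cast_compare
```

With s1 set to -5 and u1 set to 1, lt_signed is 1 (-5 is less than 1) and lt_unsigned is 0 (251 is not less than 1). The unsigned cast does not change the result of the implicit conversion; it documents that the unsigned comparison is intentional.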
Example 5-21: Using sign casting for a mixed signed and unsigned comparator
//
// Set lt, eq and gt flags based on if s is less-than, equal-to
// or greater-than u, respectively
//
module signed_comparator
# (parameter N = 8)                 // data size
(input  logic                clk,   // clock input
 input  logic                rstN,  // active-low async reset
 input  logic signed [N-1:0] s,     // scalable input size
 input  logic        [N-1:0] u,     // scalable input size
 output logic                lt,    // set if s less than u
 output logic                eq,    // set if s equal to u
 output logic                gt     // set if s greater than u
);
The schematic shown in Figure 5-24 is based on generic components, before the
synthesis compiler has mapped the functionality to a specific target ASIC or FPGA
device. Example 5-8 (page 167) earlier in this chapter showed an unsigned version of
this same comparator. Comparing Figure 5-8 (page 168) and Figure 5-24 shows that
synthesis mapped the unsigned and signed versions to the same generic components.
The gate-level implementations are simply comparing the bits that are set in two
vectors. It does not actually matter if those vectors are considered signed or
unsigned, so long as both vectors are the same signedness. For a signed comparator
with operands of mixed signedness, the cast operator ensures that both operands are
treated as signed values during the comparison.
Operator                                                    Precedence
() [] :: .                                                  highest
+ - ! ~ & ~& | ~| ^ ~^ ^~ ++ -- (unary operators)
**
* / %
+ - (binary operators)
<< >> <<< >>>
< <= > >= inside dist
== != === !== ==? !=?
& (binary operator)
^ ~^ ^~ (binary operators)
| (binary operator)
&&
||
?: (conditional operator)
-> <->
= += -= *= /= %= &= ^= |=
<<= >>= <<<= >>>= := :/ <= (assignment operators)
{} {{}}                                                     lowest
Operators on the same row have the same precedence. With three exceptions,
multiple operators that have the same precedence are evaluated from left to right
(referred to as operator associativity).
In the following example, a is first added to b, and then c is added to the
result of a + b.
assign sum = a + b + c;
The three exceptions to a left-to-right associativity are the conditional ( ?: ),
implication ( -> ), and equivalence ( <-> ) operators. These operators are evaluated
from right to left.
The evaluation order of operations can be explicitly controlled using parentheses.
In the following statement, the power operator has a higher precedence than the add
operator, so the normal evaluation order would be to evaluate the b ** 2 power
operation first, and then add that result to a.
assign out = a + b**2;
This implicit evaluation order based on operator precedence and associativity can
be changed by using parentheses. In this next snippet, (a + b) will be evaluated
first, and that result will be raised to the power of 2.
assign out = (a + b)**2;
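The effect of the parentheses can be confirmed with a short simulation-only sketch. The module name, variable names and operand values below are assumptions chosen for illustration.

```systemverilog
// Illustrative sketch: names and operand values are assumptions.
module precedence_demo;
  int a = 2, b = 3;
  int no_parens, with_parens;

  initial begin
    no_parens   = a + b**2;    // ** is evaluated first: 2 + 9 = 11
    with_parens = (a + b)**2;  // + is evaluated first: 5 squared = 25
    $display("%0d %0d", no_parens, with_parens);
  end
endmodule: precedence_demo
```

The int type is used here only to keep the arithmetic simple; the same precedence rules apply to logic vectors.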
5.17 Summary
This chapter has also elaborated on SystemVerilog’s loosely typed value conversion
rules for when an operation involves different data types. SystemVerilog ensures that
the operands of operations are converted to a common type and vector size before
performing operations. These implicit conversions occur automatically. The
SystemVerilog RTL guidelines and implicit conversions discussed in this chapter will
generally ensure that the RTL code will synthesize into a proper gate-level
implementation. This is because SystemVerilog is a Hardware Description Language,
and not a software programming language. However, the implicit conversions are not
always obvious, and occasionally an engineer might want to do something different
than the implicit conversions. This chapter has shown how the cast operator can be
used to both document conversions and cause specific type, size or signedness
conversions to explicitly happen.
* * *
Chapter 6
RTL Programming Statements
Abstract — Programming statements, such as if-else decisions and for-loops are used
to model hardware behavior at an abstract level, without the complexity and details
of logic gates, propagation delays, setup times, and connectivity. This chapter
discusses the SystemVerilog programming statements that are appropriate for RTL
modeling. Important best coding practices for simulation and synthesis are emphasized.
The topics presented in this chapter include:
• General purpose always procedural block and sensitivity lists
• Specialized always_ff, always_comb and always_latch procedural blocks
• Procedural begin...end statement groups
• Decision statements
• Looping statements
• Jump statements
• No-op statement
• Tasks and functions
Always procedures are infinite loops. They execute their programming statements,
and, upon completion, automatically start over again. The general concept is that
when power is on, hardware is always doing something. This continuous behavior is
modeled using always procedures.
SystemVerilog has four types of always procedures: a general purpose procedure
using the keyword always, and specialized always procedures that use the keywords
always_ff, always_comb and always_latch.
General purpose always procedures. The always procedural block can be used to
model many types of functionality, including synthesizable RTL models, abstract
behavioral models such as RAMs that will not be synthesized, and verification code
such as clock oscillators or continuous response checkers. While the flexibility of the
general purpose always procedure makes it useful in a wide variety of modeling and
verification projects, that same flexibility means that software tools do not know
when the intended usage of always is for synthesizable RTL models. Synthesis
places a number of coding restrictions on general purpose always procedures in
order to accurately translate the RTL model into ASIC or FPGA devices.
Latched logic sensitivity. Latches are a form of combinational logic block that can
store their current state. Modeling latched behavior follows the same sensitivity list
rules as modeling combinational logic behavior. The general purpose always
keyword is followed by a sensitivity list that includes all signals that are read by that
block of logic, in the form of @(<signal_name>, <signal_name>, ...), as in:
always @(enable, data)
  if (enable) out <= data;
The always_latch specialized always procedure automatically infers a proper
combinational logic sensitivity list.
always_latch
if (enable) out <= data;
Chapter 9 discusses modeling latched logic in more detail, including best-practice
coding guidelines for using the always and always_latch procedural blocks.
always_comb
begin // begin-end is the single group
sum = a + b;
dif = a - b;
end
A statement can be nested within another statement, as in:
always @(posedge clk)
  if (enable) // single outer statement
    for (int i=0; i<=15; i++) // nested statement
      out[i] = a[i] ^ b[(N-1)-i]; // another nested stmt
In the preceding code snippet, the outer statement is the single statement in the
always procedure, and therefore does not require a begin...end group.
A begin-end group can be named, using the syntax:
begin: <name>
A named statement group can contain local variable declarations. Local variables
can be used within the statement group, but cannot be referenced outside of the group
in synthesizable RTL models. (A later version of SystemVerilog added the ability to
declare local variables in unnamed begin-end groups, but this was not supported by
most synthesis compilers at the time this book was written.)
Optionally, the matching end of the group can also be named. Naming the end of a
statement group can help visually match up nested statement groups. SystemVerilog
requires that the names used for the begin and the end must match exactly.
The use of local variables helps ensure proper synthesis results in certain contexts. A
temporary intermediate variable calculated in a sequential always procedure and used
by another procedure might appear to work in simulation, but can synthesize into
gate-level functionality that might not match the RTL simulation behavior. Declaring
a local variable within a procedure will prevent this coding error — a local variable
cannot be accessed from outside of the procedure.
always_comb
sum = a + b; // sum must be a variable type
It is only the left-hand side of a procedural assignment that must be a variable. The
right-hand side of assignments can use variables, nets, parameters or literal values.
Operators that return a true/false result are listed in Chapter 5, sections 5.6, 5.7, 5.8
and 5.9.
Do not perform true/false tests on vectors. Evaluating vectors as true or false could
lead to design errors. In the preceding example, did the engineer writing the code
intend to test (a & b), which is an 8-bit vector value, or (a && b), which is a 1-bit
result of a true/false logical operation? Which branch the if-else decision executes can
be different for some values of a and b. This ambiguity and possible coding bug will
be avoided by following the guideline to only use scalar (1-bit) values, or the return of
operations that have a true/false result.
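To make the hazard concrete, the following is a minimal sketch in which the two tests
take different branches for the same operand values. The module and signal names are
hypothetical and are not taken from the original example. With a = 8'b00000001 and
b = 8'b00000010, the bitwise AND is all zeros (false), while the logical AND is true
because both operands are non-zero.

```systemverilog
// Hypothetical illustration: bitwise-AND versus logical-AND as a true/false test
module vector_test_demo
(input  logic [7:0] a, b,
 output logic       bitwise_true,   // outcome of testing (a & b)
 output logic       logical_true    // outcome of testing (a && b)
);
  always_comb begin
    // 8-bit result: true only if a and b have a set bit in the same position
    if (a & b)  bitwise_true = '1;
    else        bitwise_true = '0;

    // 1-bit result: true if a is non-zero and b is non-zero
    if (a && b) logical_true = '1;
    else        logical_true = '0;
  end
endmodule: vector_test_demo
```

The first test is the style the guideline warns against; it is shown here only to
contrast the two behaviors.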
With 4-state values, it is possible that an expression is neither true nor false, as in the
value 8'b0000000z. An expression that is neither true nor false is considered to be
unknown. The false branch will be executed when the expression of an if-else
decision evaluates as unknown. This can cause a mismatch between how RTL models
simulate and how post-synthesis gate-level models actually behave. This circumstance is
discussed in Appendix C on X-optimism and X-pessimism in SystemVerilog models.
Each branch of an if-else decision can be a single statement or a group of
statements enclosed between begin and end, as shown in the following code snippet.
If statements without an else branch. The else (false) branch of an if-else
decision is optional. If there is no else branch, and the expression evaluates as false (or
unknown), then no statement is executed. In the following code snippet, if enable is
0, then out is not changed. Since out is a variable (see section 6.1.3, page 216), it
retains its previous value, modeling the storage behavior of a latch.
always_latch
if (enable) out <= data;
time, the reset has priority because it is evaluated first in the series of decisions. The
set and reset controls in this example are active-low signals.
always_ff @(posedge clk or negedge rstN or negedge setN)
  if (!rstN) q <= '0; // reset register
  else if (!setN) q <= '1; // set register
  else q <= d; // clock the register
(This set-reset flip-flop example has a potential simulation glitch, which is
discussed in Chapter 8, section 8.1.5.4, page 290.)
Using if-else as a multiplexor. Example 6-1 and its accompanying synthesis result
in Figure 6-1 show if-else being used in the context of a multiplexor.
always_comb begin
if (sel) y = a;
else y = b;
end
endmodule: mux2to1
always_latch begin
if (ena) out <= in;
end
endmodule: latch
Using if-else as a priority encoder. Example 6-3 illustrates an if-else-if in the
context of a 4-to-2 priority encoder. (Example 6-6, page 227, shows a variation of this
same design.)
Figure 6-3: Synthesis result for Example 6-3: if-else as a priority encoder
The priority encoding in Figure 6-3 is implemented as a series of logic gates where
the output of one stage becomes the input to the next stage in the series, rather than
encoding all of the bits of d_in in parallel. This serial data path is a result of the
priority in which the bits of d_in are evaluated in the if-else-if series.
Example 6-4: Using if-else-if series to model a flip-flop with reset and chip-enable
module enable_ff
#(parameter N = 1) // bus size
(input logic clk, // posedge triggered clk
 input logic rstN, // active low async reset
 input logic enable, // active high chip enable
 input logic [N-1:0] d, // scalable input size
 output logic [N-1:0] q // scalable output size
);
  always_ff @(posedge clk or negedge rstN) // async reset
    if (!rstN) q <= '0; // active-low reset
    else if (enable) q <= d; // store if enabled
endmodule: enable_ff
Figure 6-4: Synthesis result for Example 6-4: if-else as a chip-enable flip-flop
Figure 6-4 shows how synthesis has mapped the chip-enable flip-flop with
active-low reset to a generic component. The next step in the process is for the
synthesis compiler to map this generic component to a specific type of flip-flop
available in a target ASIC or FPGA device. If that target device does not have a
chip-enable flip-flop, then synthesis will add multiplexor functionality outside of the
flip-flop to mimic the chip-enable behavior. The multiplexor will pass the new value of
data to the D input if the flip-flop is enabled, and will feed the flip-flop Q output back
to the D input if the flip-flop is not enabled. In a similar manner, if the target device
does not have flip-flops with asynchronous active-low resets, the synthesis compiler
will add functionality outside of the flip-flop to mimic this behavior. Modeling and
synthesizing flip-flops with various types of resets is discussed in Chapter 8,
section 8.1.5 (page 286).
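The multiplexor-plus-flip-flop arrangement described above can be written out
explicitly in RTL. The following is only a sketch of the equivalent functionality; the
module name enable_ff_mux and the internal signal d_mux are hypothetical, and an
actual synthesis compiler may build the feedback path differently.

```systemverilog
// Sketch only: chip-enable behavior built from a plain D flip-flop
// plus a feedback multiplexor (names are hypothetical)
module enable_ff_mux
#(parameter N = 1)                  // bus size
(input  logic         clk,          // posedge triggered clk
 input  logic         rstN,         // active low async reset
 input  logic         enable,       // active high chip enable
 input  logic [N-1:0] d,
 output logic [N-1:0] q
);
  logic [N-1:0] d_mux;              // multiplexor output feeding the D input

  always_comb begin
    if (enable) d_mux = d;          // pass new data when enabled
    else        d_mux = q;          // feed Q back to D when not enabled
  end

  always_ff @(posedge clk or negedge rstN)
    if (!rstN) q <= '0;             // active-low reset
    else       q <= d_mux;          // flip-flop without a chip-enable input
endmodule: enable_ff_mux
```

In simulation this sketch should behave the same as Example 6-4. Writing the enable as
an if without an else, as in Example 6-4, is the more concise style and leaves the
choice of implementation to the synthesis compiler.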
The default case item. An optional default case item can be specified by using the
default keyword. The default will be executed if the case expression did not match
any of the case items. In the example above, the case items cover all the possible
2-state values of a 2-bit opcode. If opcode is a 4-state type, however, there are
additional X and Z values that are not decoded by the case items. If opcode should have
any bits that are X or Z, the default branch will be executed, which, in the
preceding example, will propagate an X value onto the result variable. The default case
item does not need to be the last case item. Syntactically, the default can be the first
case item, or anywhere in the middle of the case items. A best-practice coding style
for code readability is to make the default case item the last case item.
1'bz:
1'bx:
endcase
end
With the case...inside case statement, the case expression is compared to the
case items using the behavior of the ==? wildcard case equality operator (see Chapter
5, section 5.8, page 168). The ==? operator allows bits to be masked from the
comparison. Any bit in a case item that is set to x or z or ? is masked, and that bit
position is ignored when the case expression is compared to the case item.
In the following example, the first branch will be executed if the most significant
bit of selector is set. All the remaining bits of selector are ignored. The second
branch will be taken if the upper two bits of selector have the value 01, and the
remaining bits are ignored, and so forth.
always_comb begin
  case (selector) inside
    4'b1???: out = a; // MSB is set
    4'b01??: out = b;
    4'b001?: out = c;
    4'b0001: out = d;
    default: out = '0; // no bits are set
  endcase
end
SystemVerilog replaced casex and casez because they have a serious flaw
in their simulation rules that can synthesize into logic gates that behave very
differently than the RTL simulation. In brief, casex and casez not only allow bits to
be masked in the case items, but also allow masking bits in the case expression. This
double masking can lead to a branch being executed that was not intended, and that
might not be the same branch that the gate-level implementation created by synthesis
would take. The hazards of casex and casez are not discussed in this book because
there is no need to ever use these constructs — the case...inside statement makes
these older constructs obsolete.
always_comb begin
  case (select)
    2'b00: y = a;
    2'b01: y = b;
    2'b10: y = c;
    2'b11: y = d;
  endcase
end
endmodule: mux4to1
Figure 6-5: Synthesis result for Example 6-5: case statement as a 4-to-1 MUX
The case items in Example 6-5 are mutually exclusive, meaning it is not possible
for two of these case items to be true at the same time. Therefore, the synthesis
compiler removed the priority encoded behavior of the case statement, and implemented a
more gate-efficient parallel evaluation of the case items, in the form of a multiplexor.
The removal of priority logic by synthesis compilers occurs automatically, as long
as synthesis can determine that all case items are mutually exclusive (there will never
be two or more case items that evaluate as true at the same time). Synthesis compilers
will leave in the priority evaluation of case items if they cannot determine that the
case items are mutually exclusive.
Example 6-6 is similar to the 4-to-2 priority encoder shown in Example 6-3, but
this time uses case...inside to allow for checking only specific bits in the 4-bit
d_in value. Because other bits are ignored, there is a possibility of more than one
case item being true at the same time. Simulation will execute the first matching
branch, and synthesis compilers will match that behavior by leaving in the priority
encoding that is inherent in case statements.
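A priority encoder of this kind might be sketched as follows. This is an illustration
written to match the description above, not a reproduction of Example 6-6; the module
name, the d_out output and the error flag are assumptions. Each case item checks a
single bit of d_in and masks the others, so a value such as 4'b1010 matches more than
one item, and the first matching item wins.

```systemverilog
// Hedged sketch of a 4-to-2 priority encoder using case...inside
// (module name, d_out and error are hypothetical)
module priority_4to2_sketch
(input  logic [3:0] d_in,
 output logic [1:0] d_out,
 output logic       error           // set when no bit of d_in is set
);
  always_comb begin
    d_out = '0;                     // defaults avoid unintended latches
    error = '0;
    case (d_in) inside
      4'b1???: d_out = 2'h3;        // bit 3 is set; lower bits ignored
      4'b?1??: d_out = 2'h2;        // bit 2 is set
      4'b??1?: d_out = 2'h1;        // bit 1 is set
      4'b???1: d_out = 2'h0;        // bit 0 is set
      default: error = '1;          // no bits are set
    endcase
  end
endmodule: priority_4to2_sketch
```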
Figure 6-6: Synthesis result for Example 6-6: case...inside as a priority encoder
The effect of the priority logic can be seen in the series of gates through which
different bits of d_in propagate. The circuitry is very similar to what the synthesis
compiler generated for this same design when a series of if-else-if decisions were used,
as shown earlier in Example 6-3 and Figure 6-3 (page 221).
always_comb begin
  for (int i=0; i<=N-1; i++)
    y[i] = a[i] ^ b[(N-1)-i]; // XOR a and reverse order of b
end
Synthesis compilers implement loops by first “unrolling” the loop, meaning the
statement or begin-end statement group in the loop is replicated the number of times
that the loop iterates. In the code snippet above, the assignment statement is replicated
four times, because i will iterate from 0 to 3. The code that synthesis sees after it
unrolls the loop is:
always_comb begin
  y[0] = a[0] ^ b[3-0];
  y[1] = a[1] ^ b[3-1];
  y[2] = a[2] ^ b[3-2];
  y[3] = a[3] ^ b[3-3];
end
The number of iterations a loop will execute must be fixed in order for synthesis
to unroll the loop. Loops with a fixed number of iterations are referred to
as static loops, and are discussed in more detail in section 6.3.1.1 (page 230).
The advantage of loops becomes apparent when there is a larger number of
iterations. If a and b had been 64-bit busses in the for loop snippet above, it would have
required 64 lines of code to manually exclusive-or the two 64-bit busses. With a for
loop, only two lines of code are needed regardless of the vector size of the busses.
Example 6-7 shows a complete parameterized model of the code snippet above.
Figure 6-7 shows the results of synthesizing this model.
always_comb begin
  for (int i=0; i<N; i++) begin
    y[i] = a[i] ^ b[(N-1)-i]; // XOR a and reverse order of b
  end
end
Figure 6-7: Synthesis result for Example 6-7: for-loop to operate on vector bits
It can be seen in Figure 6-7 how the four iterations of the for loop were unrolled
to become four instances of the exclusive-or operation.
Synthesizable way to exit a loop without data dependence. Example 6-8 shows a
coding style for the preceding snippet that is synthesizable. Instead of depending on
the value of data to determine the end of the loop, Example 6-8 uses a static loop that
executes a fixed number of times. Rather than terminating the loop early when the
lowest set bit is found, the loop simply does nothing for the remaining iterations, after
finding the lowest bit that is set. Figure 6-8 shows the results from synthesizing this
example. The bus size of data is parameterized in this example, and set to only 4 bits
wide in order to reduce the size of the schematic to fit the page size of this book.
Example 6-8: Using a for loop to find the lowest bit that is set in a vector
module find_lowest_bit
#(parameter N = 4) // bus size
(input logic [N-1:0] data,
 output logic [$clog2(N):0] low_bit
);
  logic done; // local flag: set once the lowest set bit is found

  always_comb begin
    // find lowest bit that is set in a vector
    low_bit = '0;
    done = '0;
    for (int i=0; i<=N-1; i++) begin
      if (!done) begin
        if (data[i]) begin
          low_bit = i;
          done = '1;
        end
      end
    end
  end
endmodule: find_lowest_bit
Figure 6-8: Synthesis result for Example 6-8: for-loop to find lowest bit set
Code all loops with a fixed iteration size. This coding style ensures the loop
can be unrolled, and will be supported by all synthesis compilers.
A repeat loop executes a loop a set number of times. The general syntax of a
repeat loop is:
Example 6-9: Using a repeat loop to raise a value to the power of an exponent
module exponential
#(parameter E = 3, // power exponent
  parameter N = 4, // input bus size
  parameter M = N*2 // output bus size
)
(input logic clk,
 input logic [N-1:0] d,
 output logic [M-1:0] q
);
Figure 6-9 shows the result of synthesizing Example 6-9. With E having a value of
3, the repeat loop executes 2 times, resulting in synthesis creating 2 instances of a
multiplier. Each bit of the output vector q is registered by a generic flip-flop. Only
the first of the output register flip-flops is shown in this figure.
Figure 6-9: Synthesis result for Example 6-9: repeat loop to raise to an exponent
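The behavior described for Example 6-9 can be sketched as shown here. This is a
reconstruction written to be consistent with the text (a repeat loop that runs E-1
times, with a registered output), not a verbatim copy of the example. The module name
exponential_sketch, the block name power_loop and the local variable q_temp are
assumptions.

```systemverilog
// Hedged sketch: raise d to the power E using a static repeat loop
module exponential_sketch
#(parameter E = 3,                  // power exponent
  parameter N = 4,                  // input bus size
  parameter M = N*2                 // output bus size
)
(input  logic         clk,
 input  logic [N-1:0] d,
 output logic [M-1:0] q
);
  always_ff @(posedge clk) begin: power_loop
    logic [M-1:0] q_temp;           // local intermediate value
    if (E == 0)
      q <= 1;                       // anything to the power of 0 is 1
    else begin
      q_temp = d;
      repeat (E-1) begin            // E-1 multiplies; 2 iterations when E is 3
        q_temp = q_temp * d;
      end
      q <= q_temp;                  // register the final result
    end
  end: power_loop
endmodule: exponential_sketch
```

Because E is a parameter, the loop count is fixed at elaboration time, which is what
allows the loop to be unrolled. Note that with the default M = N*2, the result is
truncated to M bits whenever d raised to the power E does not fit.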
Synthesis timing considerations. A static, zero-delay for loop or repeat loop will
synthesize to combinational logic. If the output of this combinational logic will be
registered in flip-flops, then the total propagation delay of the combinational logic
inferred by the loop must be less than one clock cycle.
NOTE
The capabilities and limitations of each specific ASIC or FPGA device can
vary widely. RTL models that use the multiply, divide, modulus and power
operators should be written to match the capabilities of the target device.
Observe that, in Figure 6-9, the multipliers inferred by the repeat loop in Example
6-9 are cascaded. The total propagation delay of the chain of multipliers needs to fit
within one clock cycle in order for a valid and stable result to be registered in the
output flip-flops. Some synthesis compilers can do register retiming, to insert or move
registers to create a pipeline within the combinational logic. Register retiming is a
feature of synthesis compilers, and is outside the scope of this book. Refer to the
documentation of a specific synthesis compiler for more information on this topic.
If register retiming is not available, then a loop that does not meet the clock period
of the design will need to be re-coded as a pipeline or state machine in order to
manually break the loop into multiple clock cycles.
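As one possible illustration of manually re-coding a loop as a pipeline, the sketch
below computes d cubed with one multiplier per clock cycle instead of two cascaded
multipliers in a single cycle. The module name cube_pipelined and the register names
sq_reg and d_reg are hypothetical, and the output width is widened to N*3 here so that
the full result fits.

```systemverilog
// Sketch: d cubed computed over two clock cycles, one multiplier per stage
module cube_pipelined
#(parameter N = 4,                  // input bus size
  parameter M = N*3                 // output bus size (full-width result)
)
(input  logic         clk,
 input  logic [N-1:0] d,
 output logic [M-1:0] q             // valid two clock cycles after d is sampled
);
  logic [2*N-1:0] sq_reg;           // stage 1 result: d squared
  logic [N-1:0]   d_reg;            // d delayed one cycle, aligned with sq_reg

  always_ff @(posedge clk) begin
    sq_reg <= d * d;                // stage 1: first multiplier
    d_reg  <= d;
    q      <= sq_reg * d_reg;       // stage 2: second multiplier
  end
endmodule: cube_pipelined
```

The trade-off is one extra clock cycle of latency and the additional pipeline
registers, in exchange for a shorter combinational path in each stage.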
Use for loops and repeat loops for RTL modeling. Do not use while and
do-while loops.
Although these loops are supported by many synthesis compilers, they have
restrictions that limit their usefulness in RTL models, and can make code difficult to
maintain and reuse. Instead, use for loops or repeat loops with a static number of times
the loop will iterate. The while and do-while loops are shown in this section for
completeness, but are not recommended.
A while loop executes a programming statement or begin-end group of statements
until an end_expression becomes false. The end expression is tested at the top of the
loop. If the end expression is false when the loop is first entered, the statement or
statement group is not executed at all. If the end expression is true, the statement or
statement group is executed, and then the loop returns back to the top and tests the
end expression again.
A do-while loop also executes a programming statement or begin-end group of
statements until an end_expression becomes false. With a do-while loop, the end
expression is tested at the bottom of the loop. Thus, the statements in the loop will
always be executed a first time when the loop is first entered. If the end expression is
false when the loop reaches the bottom, the loop exits. If the end expression is true,
the loop returns back to the top and executes the statement or statement group again.
The following code shows a non-synthesizable example of using a while loop:
always_comb begin: count_ones
logic [15:0] temp; // local temporary variable
num_ones = 0;
temp = data;
while (temp) begin // loop as long as a bit in temp is set
if (temp[0]) num_ones++;
temp >>= 1; // shift bits of temp right by 1
end
end: count_ones
This example counts how many bits of the 16-bit data signal are set to 1. The
value of data is copied into a temporary variable called temp. If bit 0 of temp is set,
the num_ones counter is incremented. The temp variable is then shifted right 1 time,
which shifts out bit 0 and shifts a 0 into bit 15. The loop continues as long as temp
evaluates as true, meaning at least one bit of temp is still set. When temp evaluates as
false, the loop exits. A value in temp that has X or Z in some bits and no bits set to 1
would also cause the while loop to exit.
This example is non-synthesizable because the number of times the loop will
execute is data-dependent, rather than static, as discussed earlier in this chapter, in
section 6.3.1.1 (page 230). Synthesis cannot statically determine how many times the
loop will execute, and therefore cannot unroll the loop.
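The same counting function can be expressed with a static loop. The following is a
minimal sketch, assuming the 16-bit data input described above; the module name and
the 5-bit width chosen for num_ones are assumptions. The loop always iterates 16
times, and each iteration simply does nothing when the bit being examined is not set.

```systemverilog
// Sketch: count the bits set in data using a static for loop
module count_ones_static
(input  logic [15:0] data,
 output logic [4:0]  num_ones       // values 0 to 16 need 5 bits
);
  always_comb begin
    num_ones = '0;
    for (int i=0; i<=15; i++) begin // fixed 16 iterations, can be unrolled
      if (data[i]) num_ones++;      // count this bit if it is set
    end
  end
endmodule: count_ones_static
```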
The foreach loop is used to iterate through array elements. The foreach loop
will automatically declare its loop control variables, automatically determine the
starting and ending indices of the array, and automatically determine the direction of
the indexing (increment or decrement the loop control variables).
The following example iterates through a 2-dimensional array that represents a
look-up table with some data. For each element in the array, a function is called to do
some sort of manipulation on that value (the function is not shown).
bit [7:0] LUT [0:7] [0:255]; // look-up table (2-state)
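A foreach loop over this table might look like the following sketch. The module
wrapper, the clk and update inputs, and the body of update_function are assumptions
made so that the example is self-contained; in particular, update_function is written
here as a function that returns the new value, which may differ from the function
used in the original example.

```systemverilog
// Hedged sketch: iterate over every element of a 2-dimensional array
module lut_foreach_sketch
(input logic clk,
 input logic update
);
  bit [7:0] LUT [0:7][0:255];       // look-up table (2-state)

  // Hypothetical stand-in for the function that is not shown in the text
  function automatic bit [7:0] update_function(bit [7:0] value);
    return value + 8'd1;
  endfunction

  always_ff @(posedge clk)
    if (update) begin
      foreach (LUT[i,j]) begin      // i covers 0 to 7, j covers 0 to 255
        LUT[i][j] <= update_function(LUT[i][j]);
      end
    end
endmodule: lut_foreach_sketch
```

The loop control variables i and j are not declared anywhere; foreach declares them
implicitly and sizes the iteration from the array declaration.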
NOTE
At the time this book was written, some synthesis compilers did not support
the foreach loop. Engineers should make sure all tools used in a project
support this loop type before using it in RTL models.
An alternate coding style to iterate through all dimensions of an array is to use
for-loops. The preceding example could be rewritten using static for loops that all
synthesis compilers support.
always @(posedge clk)
  if (update) begin
    for (int i=0; i<=7; i++) begin
      for (int j=0; j<=255; j++) begin
        update_function(LUT[i][j]);
      end
    end
  end
Observe that, in this nested for-loop example, the size of each array dimension and
its starting and ending index values must be hard-coded to match the array
declaration. SystemVerilog also provides array query system functions, which can be used
to make the for-loop more generic and adaptable to arrays of different sizes or
parameterized sizes. The preceding example can be written as:
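One way to do this is sketched below, using $low and $high so that the loop bounds
follow the array declaration regardless of whether a dimension is declared in
ascending or descending order. The module wrapper and the body of update_function are
assumptions made to keep the example self-contained, and the original example may
have used different query functions.

```systemverilog
// Hedged sketch: nested for loops bounded by array query system functions
module lut_query_sketch
(input logic clk,
 input logic update
);
  bit [7:0] LUT [0:7][0:255];       // look-up table (2-state)

  // Hypothetical stand-in for the function that is not shown in the text
  function automatic bit [7:0] update_function(bit [7:0] value);
    return value + 8'd1;
  endfunction

  always_ff @(posedge clk)
    if (update) begin
      for (int i=$low(LUT,1); i<=$high(LUT,1); i++) begin     // dimension 1
        for (int j=$low(LUT,2); j<=$high(LUT,2); j++) begin   // dimension 2
          LUT[i][j] <= update_function(LUT[i][j]);
        end
      end
    end
endmodule: lut_query_sketch
```

If the declaration of LUT changes size, the loops adapt without further edits.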
NOTE
At the time this book was written, some synthesis compilers did not support
the array query system functions. Engineers should make sure all tools used
in a project support these functions before using them in RTL models.
Following is a brief description of the array query system functions. Refer to the
IEEE 1800 SystemVerilog Language Reference Manual for more information on
these array query functions.
$right (array_name, dimension) — Returns the right-most index number of
the specified dimension. Dimensions begin with the number 1, starting from the
left-most unpacked dimension. After the right-most unpacked dimension, the dimension
number continues with the left-most packed dimension, and ends with the right-most
packed dimension.
$left (array_name, dimension) — Returns the left-most index number of the
specified dimension. Dimensions are numbered the same as with $right.
$increment (array_name, dimension) — Returns 1 if $left is greater than or
equal to $right, and -1 if $left is less than $right.
$low (array_name, dimension) — Returns the lowest index number of the
specified dimension, which may be either the left or the right index.
$high (array_name, dimension) — Returns the highest index number of the
specified dimension, which may be either the left or the right index.
$size (array_name, dimension) — Returns the total number of elements in the
specified dimension (same as $high - $low + 1).
$dimensions (array_name) — Returns the number of dimensions in the array,
including both packed and unpacked dimensions.
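As a worked illustration of the dimension numbering, the simulation-only sketch below
applies the query functions to the look-up table declaration used earlier in this
section. The module name is hypothetical. Dimension 1 is [0:7], dimension 2 is
[0:255], and dimension 3 is the packed dimension [7:0].

```systemverilog
// Simulation-only sketch: values returned by the array query functions
module array_query_demo;
  bit [7:0] LUT [0:7][0:255];       // 2 unpacked dimensions, 1 packed

  initial begin
    $display("left(LUT,1)      = %0d", $left(LUT,1));       // 0
    $display("right(LUT,1)     = %0d", $right(LUT,1));      // 7
    $display("increment(LUT,1) = %0d", $increment(LUT,1));  // -1, left < right
    $display("low(LUT,2)       = %0d", $low(LUT,2));        // 0
    $display("high(LUT,2)      = %0d", $high(LUT,2));       // 255
    $display("size(LUT,2)      = %0d", $size(LUT,2));       // 256
    $display("left(LUT,3)      = %0d", $left(LUT,3));       // 7, packed [7:0]
    $display("increment(LUT,3) = %0d", $increment(LUT,3));  // 1, left >= right
    $display("dimensions(LUT)  = %0d", $dimensions(LUT));   // 3
  end
endmodule: array_query_demo
```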
Jump statements allow procedural code to skip over one or more programming
statements. The SystemVerilog jump statements are continue, break and disable.
The continue and break jump statements are used within loops to control the
execution of statements within the loop. These jump statements can only be used in
for-loops, while-loops and foreach loops. They cannot be used outside of a loop.
The continue statement jumps to the end of a loop and evaluates the end
expression of the loop to determine if the loop should continue for another iteration. The
following code snippet uses a for-loop to iterate through the addresses of a small
look-up-table modeled as a 1-dimensional array of 16-bit words. Locations in the table
with a value of 0 are skipped by using the continue statement. For non-zero
locations, a function is called to do some sort of manipulation on that value (the function
is not shown).
bit [15:0] LUT [0:255]; // look-up table (2-state storage)
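A loop of this kind might be sketched as follows. The module wrapper, the load, addr
and wdata ports (added only so the table can be given non-zero contents), and the body
of update_function are all assumptions; only the LUT declaration and the use of
continue come from the description above.

```systemverilog
// Hedged sketch: skip zero-valued table locations using continue
module lut_continue_sketch
(input logic        clk,
 input logic        load,           // hypothetical: write one table location
 input logic [7:0]  addr,
 input logic [15:0] wdata,
 input logic        update          // hypothetical: update non-zero locations
);
  bit [15:0] LUT [0:255];           // look-up table (2-state storage)

  // Hypothetical stand-in for the function that is not shown in the text
  function automatic bit [15:0] update_function(bit [15:0] value);
    return value + 16'd1;
  endfunction

  always_ff @(posedge clk)
    if (load) LUT[addr] <= wdata;
    else if (update) begin
      for (int i=0; i<=255; i++) begin
        if (LUT[i] == 0) continue;  // skip rest of loop for empty locations
        LUT[i] <= update_function(LUT[i]);
      end
    end
endmodule: lut_continue_sketch
```

The loop still iterates a fixed 256 times; continue only skips the remaining
statements of the current pass.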
Example 6-10: Controlling for loop execution using continue and break
module find_bit_in_range
always_comb begin
low_bit = '0;
for (int i=0; i<N; i++) begin
if (i < start_range) continue; // skip rest of loop
if (i > end_range) break; // exit loop
if ( data[i] ) begin
low_bit = i;
break; // exit loop
end
end // end of the loop
// ... // process data based on lowest bit set
end
In this code snippet, the begin-end statement group was given the name
search_loop. The disable statement instructs simulation to immediately jump to the
end of this named begin-end group.
The original Verilog language did not have continue and break jump statements.
Instead the general purpose go-to behavior of the disable statement was used to
jump to the end of a loop, but continue execution of the next pass of the loop. The
disable statement also had to be used to prematurely break out of a loop, by
jumping past the end of the loop. To jump over statements within a loop but continue
executing the loop, the named begin-end group must be contained within the loop. To
break out of a loop, the named begin-end group must enclose the entire loop.
The following example shows the same functionality as Example 6-10, except
using disable jump statements instead of continue and break statements.
always_comb begin
low_bit = '0;
begin: loop_block
for (int i=0; i<N; i++) begin: loop
if (i < start_range) disable loop; //skip rest of loop
if (i > end_range) disable loop_block; // exit loop
if ( data[i] ) begin
low_bit = i;
disable loop_block; // exit loop
end
end: loop
end: loop_block
// ... // process data based on lowest bit set
end
The disable jump statement can be used to give the same functionality as break
and continue jump statements, as shown above. However, the disable jump
statement makes the code more difficult to read and to maintain. Using continue and
break is a simpler and more intuitive coding style.
The disable jump statement is a general purpose go-to that can be used in ways
that can be useful in verification testbenches. These other ways of using disable are
not generally supported by synthesis compilers.
SystemVerilog has functions and tasks that make it possible to partition complex
functionality into smaller, reusable blocks of code. Functions can be very useful for
RTL modeling, and are examined in this section. Tasks, though synthesizable with
limitations, have little value in RTL models. Using void functions, which are
discussed later in this section, is a better RTL coding style than using tasks. Therefore,
tasks are only discussed briefly in this book.
Functions and tasks can be defined within the module or interface (see Chapter 10)
in which they are used. The definition can appear before or after the statements that
call the function or task. Functions and tasks can also be defined in a package, and
then imported into the module or interface. The package import statement must
appear before the function or task is called. Packages and package importing are
discussed in Chapter 4, section 4.2 (page 102).
6.6.1 Functions
When called, a function executes its programming statements and returns a value. A
call to a function can be used anywhere an expression such as a net or variable can be
used. An example function definition and call to the function are shown here. More
practical synthesizable examples are shown later in this section.
function automatic logic [N-1:0] factorial_f([N-1:0] in);
  logic [N-1:0] f;
  if (in <= 1) f = 1;
  else f = in * factorial_f(in-1);
  return f;
endfunction: factorial_f
Static and automatic functions. Functions (and tasks) can be declared as static or
automatic. If neither is specified, the default is static for functions defined in a
module, interface or package.
A static function retains the state of any internal variables or storage from one call
to the next. The function name and function inputs are implicit internal variables, and
will retain their values when the function exits. The effect of this static storage is that
a new call to a function can remember values from a previous call. This memory can
be useful in verification code, but the behavior does not always accurately model the
gate-level behavior that synthesis compilers implement from functions, which can
lead to a mismatch between the RTL model simulations and the actual functionality of
an ASIC or FPGA.
An automatic function allocates new storage each time the function is called.
Recursive function calls, such as the factorial_f function example shown above,
require automatic storage. (Re-entrant task calls, where two different procedures call
the same task at the same time, also require automatic storage.)
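The difference between the two storage classes can be observed in simulation with a
small sketch such as the one below. All names are hypothetical. Both functions add
their argument to an internal variable; only the static version remembers the total
from the previous call.

```systemverilog
// Simulation-only sketch: static versus automatic function storage
module storage_demo;
  int s1, s2, a1, a2;               // results of the calls below

  // Static storage (the default in a module): total persists between calls
  function int sum_static(int x);
    int total;                      // static variable, starts at 0
    total += x;
    return total;
  endfunction

  // Automatic storage: total is newly allocated on every call
  function automatic int sum_auto(int x);
    int total = 0;                  // re-initialized each time
    total += x;
    return total;
  endfunction

  initial begin
    s1 = sum_static(1);             // 1
    s2 = sum_static(1);             // 2: total still held 1 from the first call
    a1 = sum_auto(1);               // 1
    a2 = sum_auto(1);               // 1: no memory of the first call
  end
endmodule: storage_demo
```

Hardware synthesized from a function has no place to hold the remembered total, which
is why the automatic version is the one that matches gate-level behavior.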
The default of static storage is not appropriate for RTL modeling of hardware
behavior. Furthermore, synthesis compilers require that functions declared in a
package or interface must be declared as automatic.
There is an historical reason that functions default to static storage. In the early
years of Verilog simulation, when computer memory was limited and processor
power was much slower, static storage helped improve simulation run-time
performance. There is no real performance advantage of static storage versus automatic
storage with modern simulators and compute servers. The SystemVerilog standard
has kept the original language default of static functions in order to remain backward
compatible with legacy verification code that might have been written to utilize the
static storage of a function.
Function returns. The return data type of a function is defined immediately before
the name of the function. In the factorial_f example above, the function returns
an N-bit wide vector with a logic (4-state) type. If no return type is specified,
functions return a 1-bit logic (4-state) type by default.
SystemVerilog provides two ways to specify the return value from a function. One
way is to use the return keyword, as shown in the preceding factorial_f
example. The return keyword is followed by the value to be returned by the
function. Optionally, this return value can be enclosed in parentheses.
A second way to specify the return value is to assign a value to the name of the
function. The function name is an implicit variable of the same data type as the return
type. This implicit variable can be used for temporary storage while the function is
calculating the return value. The last value assigned to the function name becomes the
function return value. The factorial_f function shown at the beginning of this
section could be re-coded to use the function name as an implicit internal variable to
calculate and return a value.
function automatic logic [N-1:0] factorial_f([N-1:0] in);
  if (in <= 1) factorial_f = 1;
  else factorial_f = in * factorial_f(in-1);
endfunction: factorial_f
Void functions. Optionally, a function return type can be declared as void. Void
functions do not return a value, and cannot be used as an expression like other
functions. A void function is called as a statement, instead of as an expression.
typedef struct {
  logic [31:0] data;
  logic [ 3:0] check;
  logic        valid;
} packet_t;
The only difference between a void function and a task is that a function must
execute in zero time. Most synthesis compilers do not support any form of clock delay in
tasks. Using a void function in place of a task makes this synthesis restriction a syntax
requirement, and can prevent writing RTL models that simulate, but will not
synthesize.
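As a minimal sketch of a void function that uses the packet_t structure above, the following package defines a function that fills in a packet through an output argument. The names packet_pkg, fill_packet and packet_builder, and the XOR-based check calculation, are illustrative assumptions rather than code from the original example.

```systemverilog
package packet_pkg;                      // hypothetical package name
  typedef struct {
    logic [31:0] data;
    logic [ 3:0] check;
    logic        valid;
  } packet_t;

  // Hypothetical void function: builds a packet from a raw data word.
  // The result is passed out through an output argument, not a return value.
  function automatic void fill_packet (
    input  logic [31:0] data_in,
    output packet_t     data_out
  );
    data_out.data  = data_in;
    data_out.check = '0;
    // XOR the eight 4-bit nibbles of the data word (assumed check scheme)
    for (int i = 0; i < 8; i++)
      data_out.check ^= data_in[4*i +: 4];
    data_out.valid = 1'b1;
  endfunction
endpackage: packet_pkg

module packet_builder                    // hypothetical module name
  import packet_pkg::*;
  (input  logic [31:0] raw_data,
   output packet_t     pkt
  );
  always_comb
    fill_packet(raw_data, pkt);          // called as a statement
endmodule: packet_builder
```

Because fill_packet is a function, adding a delay or event control inside it would be a compilation error, instead of a problem that only shows up at synthesis.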
A formal argument can also be declared as ref (short for reference) instead of a
direction. A ref argument is a form of a pointer to the actual argument of the call to
the function. A function must be declared as automatic to use ref arguments.
All RTL synthesis compilers support input and output function arguments. The
inout and ref arguments are not supported by some RTL synthesis compilers.
Calling functions. There are two coding styles for passing actual arguments to the
formal arguments when a function is called: pass-by-order and pass-by-name. With
pass-by-order, the first actual argument is passed to the first formal argument, the
second actual argument to the second formal argument, and so forth. Pass-by-name uses
the same syntax as connecting modules by name. The name of the formal argument is
preceded by a period (.), followed by the actual argument enclosed in parentheses.
Given the function definition:
function automatic int inc_f(int count, step);
return (count + step);
endfunction
The two styles of passing actual arguments are:
always_ff @ (posedge master_clk)
m_data <= inc_f(data_bus, 1); // pass-by-order
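The same call written in the pass-by-name style is sketched below as a complete module. The wrapper module inc_example and its port declarations are assumptions added so that the fragment can be compiled on its own; inc_f is the function defined above.

```systemverilog
module inc_example (                     // hypothetical wrapper module
  input  logic        master_clk,
  input  logic [31:0] data_bus,
  output logic [31:0] m_data
);
  function automatic int inc_f(int count, step);
    return (count + step);
  endfunction

  always_ff @(posedge master_clk)
    m_data <= inc_f(.count(data_bus), .step(1));  // pass-by-name
endmodule: inc_example
```

With pass-by-name, the order of the actual arguments does not matter, so inc_f(.step(1), .count(data_bus)) is an equivalent call.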
Function input default values. Formal input arguments can be assigned a default
value, as in:
function automatic int inc_f(int count, step=1);
  return (count + step);
endfunction
Arguments with a default value do not need to be passed an actual value. If no
actual value is passed in, the default value is used. For example:
always_ff @ (posedge master_clk)
m_data <= inc_f( .count(data_bus) );
If an actual value is passed in, the actual value is used, as in:
always_ff @ (posedge slave_clk)
  s_data <= inc_f( .count(data_bus), .step(8) );
NOTE
Default input values were not supported by some synthesis compilers at the
time this book was written. Engineers should make sure all tools in the
design flow used in a project support default input values before using them
in RTL models.
Using return to exit a function early. The return statement can also be used to
exit from a function before all statements in the function have been executed. The
following example can exit the function at three different points. If the max input is 0, the
function exits prior to executing the for loop. If the for loop iterator reaches the
value of max, the function exits before reaching the end of the loop. If the for loop
completes, the function exits when the endfunction is reached.
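One possible function with these three exit points is sketched below. The package name, the function name count_ones_f, and its bit-counting behavior are assumptions chosen for illustration; only the three exit points follow the description above.

```systemverilog
package early_return_pkg;                // hypothetical package name
  localparam int N = 32;

  // Hypothetical function: counts the bits that are set in data[max-1:0].
  function automatic int count_ones_f (input logic [N-1:0] data,
                                       input int           max);
    int count = 0;                       // re-initialized on every call
    if (max == 0) return 0;              // exit 1: before the for loop
    for (int i = 0; i < N; i++) begin
      if (i == max) return count;        // exit 2: iterator reached max
      if (data[i]) count++;
    end
    count_ones_f = count;                // value returned at endfunction
  endfunction                            // exit 3: the loop completed
endpackage: early_return_pkg
```

The loop bound N is a constant, so the loop can be statically unrolled even though the function may return before the final iteration.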
parameter N = 32;
always_comb begin
  y16 = Functions #(.SIZE(16))::adder_f(a16, b16);
  // reconfigure to 16-bit adder
  y32 = Functions #(.SIZE(32))::adder_f(a32, b32);
  // reconfigure to 32-bit adder
end
Parameterized functions make it possible to create and maintain only one version
of the function, instead of having to define several versions with different data types,
vector widths, or other characteristics.
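A parameterized virtual class that would support the two calls above can be sketched as follows. The default SIZE of 32 and the body of adder_f are assumptions; only the class name, the parameter name, and the function name are taken from the calls.

```systemverilog
// Sketch only: the default value and the function body are assumed.
virtual class Functions #(parameter SIZE = 32);
  // A static class function can be called with the :: scope operator,
  // without constructing an object of the class
  static function logic [SIZE-1:0] adder_f (input logic [SIZE-1:0] a, b);
    return a + b;                        // SIZE-bit add, carry out discarded
  endfunction
endclass: Functions
```

Each distinct value of SIZE used in a call, such as Functions #(.SIZE(16)), creates a separate specialization of the class and of adder_f.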
Observe that, in a class definition, the static keyword comes before the function
keyword, whereas in a module, the static or automatic keyword comes
after the function keyword. There is an important semantic difference. In a class,
static function declares the lifetime of the function within the class, and restricts
what the function can access within the class. In a module, function static or
function automatic refers to the lifetime of the arguments and variables within
the function.
NOTE
At the time this book was written, not all synthesis compilers supported
static functions in parameterized virtual classes. Engineers should make sure
all tools used in a project support static functions in parameterized virtual
classes before using them in RTL models.
6.6.2 Tasks
A task is a subroutine that encapsulates one or more programming statements, so that the
encapsulated statements can be called from different places or reused in other
projects. Unlike functions, tasks do not have a return value. An example task is:
task automatic ReverseBits (input  [N-1:0] in,
                            output [N-1:0] out);
  for (int i=0; i<N; i++)
    out[(N-1)-i] = in[i];
endtask
A task is called as a programming statement, and uses output formal arguments to
pass values out of the task.
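A call to this task might look like the following sketch. The module name reverse_example, the parameter default, and the port names are assumptions added for illustration.

```systemverilog
module reverse_example                   // hypothetical module name
  #(parameter N = 8)
  (input  logic [N-1:0] data_in,
   output logic [N-1:0] data_rev
  );

  task automatic ReverseBits (input  [N-1:0] in,
                              output [N-1:0] out);
    for (int i=0; i<N; i++)
      out[(N-1)-i] = in[i];
  endtask

  always_comb
    ReverseBits(data_in, data_rev);      // task call is a statement
endmodule: reverse_example
```

The value computed by the task reaches data_rev through the output formal argument out, since the task itself has no return value.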
6.7 Summary
RTL simulators and synthesis compilers need to know when to execute programming
statements. An always procedure with a sensitivity list (explicit or inferred) is used to
control when statements are executed. SystemVerilog has four types of always
procedures: the generic always, and the type-specific always_ff, always_comb and
always_latch procedures. This chapter has introduced these constructs and used
them in a number of code examples. The next chapters will examine the proper usage
of these always procedures in much greater detail as the topics of modeling
combinational logic, sequential logic and latched logic components are examined.
Programming statements can also be contained in functions and tasks, which are
called from always procedures. The rules and best coding practices for functions and
tasks were covered in this chapter.
* * *
Chapter 7
Modeling Combinational Logic
Abstract: This chapter builds on the programming statements and operators
discussed in the previous chapters, and adds more details on best-practice coding styles
for RTL models of combinational logic. An emphasis is placed on writing RTL
models that ensure simulation behavior matches post-synthesis gate-level behavior.
Digital gate-level circuitry can be divided into two broad categories: combinational
logic, discussed in this chapter, and sequential logic, discussed in the next chapter.
Latches are a cross between combinational and sequential logic, and are treated as a
separate topic in Chapter 9.
Combinational logic describes gate-level circuitry where the outputs of a block of
logic directly reflect a combination of the input values to that block. The output of a
two-input AND gate, for example, is the logical-and of the two inputs. If an input
value changes, the output value will reflect that change. RTL models of
combinational logic need to reflect this gate-level behavior, meaning that the output of a block
of logic must always reflect a combination of the current input values to that block of
logic.
SystemVerilog has three ways to represent combinational logic at a synthesizable
RTL level: continuous assignments, always procedures, and functions. Each of these
coding styles is explored in this chapter, and best-practice coding styles are
recommended.
The topics presented in this chapter include:
• Continuous assignment statements
• Always procedures, when modeled following strict coding guidelines
• The always_comb procedure and simulation rules
• The obsolete always @* procedure
• Using functions to model combinational logic
Left-side types. The left-hand side of a continuous assignment can be a scalar (1-bit)
or vector net or a variable type, or a user-defined type. The left-hand side cannot be an
unpacked structure or unpacked array.
There is an important difference between using a net and a variable on the left-hand
side of a continuous assignment:
• Net types, such as wire or tri, can be driven by multiple sources, including
multiple continuous assignments, multiple connections to output or inout ports of
module or primitive instances, or any combination of drivers.
• Variable types, such as var or int, can only be assigned a value from a single
source, which can be: a single input port, a single continuous assignment, or any
number of procedural assignments (multiple procedural assignments are considered
to be a single source; synthesis requires the multiple procedural assignments be in
the same procedure).
Note that the logic keyword infers a data type, but is not, in itself, a net or variable
type. When logic is used by itself, a variable is inferred, with its single-source
assignment restriction. A variable is also inferred when the keyword pair
output logic is used to declare a module port. When the keyword pair
input logic or inout logic is used to declare a module port, a wire net type is
inferred, with its multiple driver capability.
Chapter 3, sections 3.4.1 (page 68) and 3.6.1 (page 84) discuss the rules and proper
usage of the logic type in more detail.
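These inference rules can be summarized in a short sketch. The module and signal names are hypothetical; the comments restate the rules given above.

```systemverilog
module logic_inference (                 // hypothetical module name
  input  logic a,                        // inferred as a wire net
  inout  logic b,                        // inferred as a wire net
  output logic y                         // inferred as a variable
);
  logic t;                               // inferred as a variable

  assign t = a & b;                      // the single source allowed for t
  assign y = t;                          // the single source allowed for y
endmodule: logic_inference
```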
Only use a net type (such as wire or tri) when multiple drivers are intended, such
as for a shared bus, a tri-state bus or an inout bidirectional module port. See section
3.6.1 (page 84) for more information on declaring module port data types.
For RTL modeling, there is an important advantage to the semantic rule that
variables can only have a single source. Most signals in ASIC and FPGA devices are
expected to be single-source logic, with the exception of tri-state busses and
bidirectional ports. The single-source restriction of variables can help prevent inadvertent
coding errors, where multiple continuous assignments or connections are made to the
same signal. With variable types, a multiple-source coding mistake will be reported as
a compilation or elaboration error in both simulation and synthesis.
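As a small sketch of this rule, the module below has a single continuous assignment to a variable. The second assignment is left as a comment because enabling it would create a second source for y. The module and signal names are hypothetical.

```systemverilog
module single_source (                   // hypothetical module name
  input  logic a, b, c,
  output logic y                         // output logic infers a variable
);
  assign y = a & b;                      // the one permitted source for y

  // Uncommenting the next line adds a second source for the variable y,
  // which tools should reject at compilation or elaboration:
  // assign y = a | c;
endmodule: single_source
```

Had y been declared as output wire logic, both assignments would be accepted and their values resolved, silently hiding the mistake.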
Example 7-1: Add, multiply, subtract dataflow processing with registered output
module dataflow
#(parameter N = 4)                    // bus size
(input  logic         clk,            // scalar input
 input  logic [N-1:0] a, b, c,        // scalable input size
 input  logic [  1:0] factor,         // fixed input size
 output logic [N-1:0] out             // scalable output size
);
  logic [N-1:0] sum, diff, prod;      // internal signals (declaration assumed)

  assign sum  = a + b;
  assign diff = prod - c;
  assign prod = sum * factor;

  always_ff @(posedge clk)            // registered output (assumed from the title)
    out <= diff;
endmodule: dataflow
Figure 7-1: Synthesis result for Example 7-1: Continuous assignment as combinational logic
always procedure is used to model the input functionality in order to trigger on rising
edges of the clock.
module SRAM (inout  wire  [7:0] data,
             input  logic [7:0] addr,
             input  logic       rw,    // 0 = read, 1 = write
             input  logic       clk
            );
endmodule: SRAM
The data bus is a bidirectional inout port, and must be a net type, such as wire or
tri, in order to have multiple drivers. The data bus can be driven by the RAM when
it is an output from the RAM, and by some other module when data bus is an input
writing into the RAM. Only continuous assignment can assign to net data types.
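A complete version of this RAM, consistent with the description above, might look like the following sketch. The storage array mem, its depth, and the exact read and write expressions are assumptions; the port list is the one shown above.

```systemverilog
module SRAM (inout  wire  [7:0] data,
             input  logic [7:0] addr,
             input  logic       rw,    // 0 = read, 1 = write
             input  logic       clk
            );
  logic [7:0] mem [0:255];             // storage array (assumed depth)

  // Output side: a continuous assignment drives the bidirectional bus
  // when reading, and releases it (high impedance) when writing
  assign data = (!rw) ? mem[addr] : 'z;

  // Input side: an always procedure stores the bus value on the
  // rising edge of the clock when writing
  always_ff @(posedge clk)
    if (rw) mem[addr] <= data;
endmodule: SRAM
```

The continuous assignment is needed on the output side because data is a net, and the procedure is needed on the input side because the write must be clocked.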
Each continuous assignment and each always procedure is a separate process that
runs in parallel, beginning at simulation time zero and running throughout simulation.
The order of continuous assignments and always procedures within a module does not
matter because the processes are running in parallel.
The primary RTL modeling construct for combinational logic is the always
procedure, using either the general purpose always keyword or the RTL-specific
always_comb keyword. These always procedures can take advantage of the robust
set of operators and programming statements that are discussed in Chapters 5 and 6,
whereas continuous assignments are limited to using only SystemVerilog operators.
Examples of a simple combinational logic adder modeled as an always procedure
and an always_comb procedure are:
always @(a, b) begin
sum = a + b;
end
always_comb begin
sum = a + b;
end
Synthesis will not allow @ or wait time control delays, and will ignore # delays.
Ignoring # delays can lead to mismatches between the RTL models that were verified in
simulation using delays and the gate-level implementation from synthesis, which
ignored the delays.
The signals in the sensitivity list can be separated by commas, as in the example
above, or by the keyword or, as in: @(a or b or mode). There is no advantage or
disadvantage to using commas versus the or keyword. Some engineers prefer the
comma-separated list because the or keyword could be mistaken as a logical-OR
operation, rather than just a separator between signals in the list.
Complete sensitivity lists. With combinational logic, the outputs of the
combinational block are a direct reflection of the current values of the inputs to that block. In
order to model this behavior, the always procedure needs to execute its programming
statements whenever any signal that affects the outputs of the procedure changes
value. An input to the combinational always procedure is any signal whose value
is read by the statements in the procedure. In the adder example above, the inputs to the
procedure (the signals that are read within the procedure) are: a, b and mode.
Procedure inputs versus module inputs. The inputs to a combinational logic
procedure might not correspond to the input ports of the module containing the
procedure. A module might contain several procedural blocks and continuous assignments,
and, therefore, have input ports for each of these blocks. A module might also contain
internal signals that pass values between procedural blocks or continuous
assignments. These internal signals will not be included in the module port list.
If mode changes value, the result output will not be updated to the new operation
result until either a or b changes value. The value of result is incorrect during the
time between when mode changed and a or b changed.
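A sketch of an adder/subtractor with this mistake is shown below. The module name and vector widths are assumptions; the operation performed for each value of mode follows the always @* example later in this section.

```systemverilog
module incomplete_sens_list (            // hypothetical module name
  input  logic [7:0] a, b,
  input  logic       mode,
  output logic [7:0] result
);
  // GOTCHA: mode is read inside the procedure, but is missing from
  // the sensitivity list, so a change on mode alone does not
  // re-evaluate result in simulation
  always @(a, b) begin
    if (!mode) result = a + b;           // add when mode = 0
    else       result = a - b;
  end
endmodule: incomplete_sens_list
```

Replacing always @(a, b) with always_comb removes the hazard, because the sensitivity list is then inferred from every signal the procedure reads.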
This coding mistake is an obvious one in small combinational logic blocks that only
read the values of a few signals, but it is not uncommon for larger, more complex
blocks of logic to read 10, 20 or even several dozen signals. It is easy to inadvertently
omit a signal in the sensitivity list when so many signals are involved. It is also common
to modify an always block during the development of a design, adding another signal
to the logic, but forgetting to add it to the sensitivity list.
A serious hazard with this coding gotcha is that many synthesis compilers will still
implement this incorrect RTL model as gate-level combinational logic, possibly with
a warning message that is easy to overlook. Though the implementation from
synthesis might be what the designer intended, it is not the design functionality that was
verified during RTL simulation. Therefore, the design functionality was not fully
verified, which could result in a bug in the actual ASIC or FPGA.
The obsolete always @* procedure. The IEEE 1364-2001 standard, often referred
to as Verilog-2001, attempted to address the gotcha of incomplete sensitivity lists
with the addition of a special token, @*, that automatically infers a complete
sensitivity list. For example:
always @* begin
  if (!mode) result = a + b; // add when mode = 0
  else       result = a - b;
end
The original Verilog language that was introduced in the 1980s only had the general
purpose always procedure. Though very useful, the general purpose nature of this
procedure has important limitations when used for RTL modeling. As a general
purpose procedure, always can be used to model combinational logic, sequential logic,
latched logic and various verification processes. When a synthesis compiler
encounters an always procedure, the compiler has no way to know what type of
functionality a design engineer intended to model. Instead, a synthesis compiler must analyze
the contents of the procedure and try to infer a designer's intent. It is all too possible
for synthesis to infer a different type of functionality than what an engineer intended.
Another limitation of the general purpose always procedure is that it does not
enforce RTL coding rules required by synthesis compilers for representing
combinational logic behavior, as summarized in section 7.2.1 (page 257). Models using
general purpose always procedures might appear to simulate correctly, but might not
synthesize to the intended functionality, resulting in lost engineering time by having
to rewrite the RTL models and reverify the functionality in simulation before the
model can be synthesized.
In this procedure, the variable sum is immediately updated to the result of the
operation a + b. This new value of sum then flows to the next statement, where the new
value is used to calculate a new value for prod. This new value for prod then flows
to the next line of code and is used to calculate the value of result.
The blocking behavior of the assignment statement is critical for this dataflow to
simulate correctly in a zero-delay RTL model. The blocking assignment in each line
of code blocks the evaluation of the next line, until the current line has updated its
left-hand side variable with a new value. The blocking of the evaluation of each
subsequent line of code is what ensures that each line is using the new value of variables
assigned by the previous lines.
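The dataflow described above can be sketched as a complete module. The names sum, prod and result come from the text; the module name, the vector widths, and the operands factor and c (borrowed from Example 7-1) are assumptions.

```systemverilog
module blocking_dataflow                 // hypothetical module name
  #(parameter N = 8)
  (input  logic [N-1:0] a, b, c,
   input  logic [  1:0] factor,
   output logic [N-1:0] result
  );
  logic [N-1:0] sum, prod;               // intermediate values

  always_comb begin
    sum    = a + b;                      // sum is updated immediately
    prod   = sum * factor;               // uses the new value of sum
    result = prod - c;                   // uses the new value of prod
  end
endmodule: blocking_dataflow
```

Each blocking assignment completes before the next statement is evaluated, so result always reflects the current values of a, b, c and factor.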
Had nonblocking assignments been inappropriately used in the code snippet above,
each assignment would have used the previous values of its right-side variables,
before those variables were updated to new values. This is not combinational logic
behavior! Synthesis compilers, however, might still create combinational logic when
nonblocking assignments are used, resulting in the behavior that was verified in RTL
simulation not matching the actual gate-level behavior after synthesis.
Simulation event scheduling, and the execution of blocking and nonblocking
assignments, are discussed in more detail in Chapter 1, section 1.5.3.5 (page 27).
2. A decision statement does not execute a branch for every possible value of the
decision expression. The following code snippet illustrates this problem.
always_comb begin
  case (opcode)
    2'b00: result = a + b;
    2'b01: result = a - b;
    2'b10: result = a * b;
  endcase
end
value of 2'b11, this example does not make any assignment to the result variable.
Because result is a variable, it retains its previous value. The retention of a value
behaves as a latch, even though the intent is that the always_comb procedure behave
as combinational logic.
A latch will be inferred even when an always_comb procedure is used. Synthesis
compilers and lint checkers will, however, report a warning or non-fatal error that a
latch was inferred in an always_comb procedure. This warning is one of the several
advantages of always_comb over a general always procedure. An always_comb
procedure documents the design engineer's intent, allowing software tools to report
when the code within the procedure does not match that intent. Chapter 9 discusses
the proper coding style for representing latches in RTL models, and how to avoid
unintentional latches when combinational logic is intended.
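Two common ways to make the case statement above assign result for every opcode value are sketched below. This is an illustration under assumed names and widths, with '0 chosen arbitrarily as the value for the unused opcode; it is not a substitute for the guidelines in Chapter 9.

```systemverilog
module no_latch_examples (               // hypothetical module name
  input  logic [1:0] opcode,
  input  logic [7:0] a, b,
  output logic [7:0] result1, result2
);
  // Style 1: add a default branch that covers the unused opcode
  always_comb begin
    case (opcode)
      2'b00:   result1 = a + b;
      2'b01:   result1 = a - b;
      2'b10:   result1 = a * b;
      default: result1 = '0;             // assumed value for 2'b11
    endcase
  end

  // Style 2: assign a default value before the decision statement
  always_comb begin
    result2 = '0;                        // assumed default value
    case (opcode)
      2'b00: result2 = a + b;
      2'b01: result2 = a - b;
      2'b10: result2 = a * b;
    endcase
  end
endmodule: no_latch_examples
```

In both styles every pass through the procedure assigns the output, so there is no previous value to retain and no latch is needed.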
module algorithmic_multiplier
import definitions_pkg::*;
(input  logic [3:0] a, b,
 output logic [7:0] result
);
  assign result = multiply_f(a, b);
endmodule: algorithmic_multiplier
Figure 7-2: Synthesis result for Example 7-2: Function as combinational logic
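The definitions_pkg package imported by this module must define multiply_f. As a sketch under stated assumptions, a shift-and-add implementation consistent with the 4-bit inputs and 8-bit result of the module could be written as follows; the algorithm itself is an assumption suggested by the module name.

```systemverilog
package definitions_pkg;
  // Assumed implementation: shift-and-add multiplication of two
  // 4-bit operands, producing an 8-bit product
  function automatic logic [7:0] multiply_f (input logic [3:0] a, b);
    multiply_f = '0;
    for (int i = 0; i <= 3; i++) begin
      // add a shifted copy of a for each bit of b that is set
      if (b[i]) multiply_f = multiply_f + (8'(a) << i);
    end
  endfunction
endpackage: definitions_pkg
```

The function contains no delays and assigns its return value on every call, so a continuous assignment that calls it behaves as combinational logic.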
SystemVerilog semantics for if-else-if decision series and case statements are that the
series of choices is evaluated sequentially. Only the first matching branch is
executed. This behavior makes it possible to represent priority encoded logic, where one
choice takes precedence over another. The following code snippet illustrates a 4-to-2
priority encoder modeled with an if-else-if decision chain, where high-order bits take
precedence over lower-order bits.
logic [3:0] d_in;
logic [1:0] d_out;
always_comb begin
  if      (d_in[3]) d_out = 2'h3; // bit 3 is set
  else if (d_in[2]) d_out = 2'h2; // bit 2 is set
  else if (d_in[1]) d_out = 2'h1; // bit 1 is set
  else if (d_in[0]) d_out = 2'h0; // bit 0 is set
  else              d_out = 2'hX; // no bits set
end
This same priority encoder can also be modeled by using a case statement. (This
example uses a coding style referred to as a reverse case statement, a coding
technique discussed in more detail in Chapter 8, section 8.2.5, page 313.)
always_comb begin
  case (d_in) inside
    4'b1???: d_out = 2'h3; // bit 3 is set
    4'b01??: d_out = 2'h2; // bit 2 is set
    4'b001?: d_out = 2'h1; // bit 1 is set
    4'b0001: d_out = 2'h0; // bit 0 is set
    4'b0000: d_out = 2'hX; // no bits set
  endcase
end
The if-else-if example and the case statement example are functionally identical,
and will synthesize to equivalent gate-level circuitry. The results from synthesizing
these examples are shown in Chapter 6, Figures 6-3 (page 221) and 6-6 (page 227),
respectively.
always_comb begin
  case (current_state)
    READY  : next_state = SET;
    SET    : next_state = GO;
    GO     : next_state = READY;
    default: next_state = READY;
  endcase
end
Chapter 8, section 8.2 (page 299) discusses modeling Finite State Machines in more
detail, and shows the full context of combinational logic state decoders.
pessimistic: they will leave in the priority-encoded logic in the gate-level
implementation, just in case it is needed. This situation typically occurs when either:
• The case item expressions use wildcard bits that could be any value. The
case inside decision allows wildcard bits. Since these bits can be any value, it
might be possible for the case expression to match multiple case items.
• The case item expressions use variables. Synthesis is a static compilation process,
and, therefore, cannot determine if the values of variables will never overlap.
Example 7-3 is a reverse case statement one-hot decoder, where the case items are
bits of a variable. (This style is discussed in Chapter 8, section 8.2.5, page 313).
Example 7-3: State decoder with inferred priority encoded logic (partial code)
typedef enum logic [2:0] {READY = 3'b001,
                          SET   = 3'b010,
                          GO    = 3'b100} states_t;
always_comb begin
  {get_ready, get_set, get_going} = 3'b000;
  case (1'b1)
    current_state[0]: get_ready = '1;
    current_state[1]: get_set   = '1;
    current_state[2]: get_going = '1;
  endcase
end
The designer might know that current_state uses one-hot encoding, and
therefore the case items are mutually exclusive. Synthesis compilers, however, cannot
statically determine that the value of the current_state variable will only have a
single bit set in all circumstances. Therefore, synthesis will implement this one-hot
decoder with priority encoded logic. The case statement will not be automatically
optimized for parallel evaluation. Figure 7-3 shows the results of synthesizing this
reverse case statement.
Figure 7-3: Synthesis result for Example 7-3: Reverse case statement with priority
Observe the series of buffers and logic gates needed to decode even this very
simple one-hot set of values. This is because the synthesis compiler is not able to
recognize that the current_state variable will only have one-hot values, and, therefore,
the case items are mutually exclusive.
The unique decision modifier. When synthesis cannot automatically detect that the
case item values are mutually exclusive, the design engineer needs to inform the
synthesis compiler that the case items are indeed unique from each other. This can be
done by adding a unique decision modifier before the case keyword, as in the
following example.
Example 7-4: State decoder with unique parallel encoded logic (partial code)
typedef enum logic [2:0] {READY = 3'b001,
                          SET   = 3'b010,
                          GO    = 3'b100} states_t;
always_comb begin
  {get_ready, get_set, get_going} = 3'b000;
  unique case (1'b1)
    current_state[0]: get_ready = '1;
    current_state[1]: get_set   = '1;
    current_state[2]: get_going = '1;
  endcase
end
Figure 7-4: Synthesis result for Example 7-4: Reverse case statement, using unique
Using unique instructs the synthesis compiler that the case items can be evaluated
in parallel. This significantly reduces the number of gates and propagation paths for
this one-hot decoder, compared to the priority implementation shown in Figure 7-3.
For synthesis, the unique decision modifier indicates that every case item
expression will have a mutually exclusive, "unique" value, and therefore the gate-level
implementation can evaluate the case items in parallel. The unique modifier further
informs synthesis that any case expression values that were not used in the case
statement can be ignored. This can trigger synthesis optimizations that reduce gate counts
and propagation paths, but these optimizations might not be desirable in some
designs. The synthesis effects and best practice guidelines for using unique are
discussed in Chapter 9, section 9.3.5 (page 340).
For simulation, unique enables run-time error checking. A violation message will
be reported unless both of the following are true:
• There are never multiple case item expressions true at the same time.
• There is a branch for every case expression value that occurs.
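These run-time checks can be exercised with a small simulation-only sketch that wraps the decoder of Example 7-4 in a testbench. The testbench module name and the stimulus values are assumptions, and the exact wording and severity of the violation messages vary between simulators.

```systemverilog
module unique_check_tb;                  // hypothetical testbench name
  logic [2:0] current_state;
  logic       get_ready, get_set, get_going;

  always_comb begin
    {get_ready, get_set, get_going} = 3'b000;
    unique case (1'b1)
      current_state[0]: get_ready = '1;
      current_state[1]: get_set   = '1;
      current_state[2]: get_going = '1;
    endcase
  end

  initial begin
    current_state = 3'b001; #1;  // one-hot value: no violation expected
    current_state = 3'b011; #1;  // two case items match: violation expected
    current_state = 3'b000; #1;  // no case item matches: violation expected
    $finish;
  end
endmodule: unique_check_tb
```

The second and third stimulus values are the two situations that the parallel gate-level implementation does not handle the same way as the RTL model, which is why the simulator flags them.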
Most case statements do not need, and should not use, the unique decision
modifier. The unique modifier can result in synthesis gate-level optimizations that might
not be desirable in many designs.
The reverse case statement coding style shown in Examples 7-3 and 7-4 is one of
the few exceptions where synthesis compilers require a decision modifier in order to
achieve optimal Quality of Results (QoR).
NOTE
At the time this book was written, one commercial synthesis compiler did
not recognize // synthesis as a synthesis pragma. That compiler required
that pragmas start with // pragma or // synopsys.
Synthesis compilers are very good at automatically detecting when a case statement
can be implemented as a parallel decoder without affecting design functionality.
In the rare situations where the synthesis compiler needs to be told to use a parallel
implementation, use the unique decision modifier. The unique decision modifier
informs synthesis compilers that the case items can be treated as mutually exclusive
in the same way as the parallel_case pragma, but the decision modifier adds
simulation run-time checking to help detect potential problems with parallel decoding of
the case items during RTL simulation.
(The unique0 decision modifier more accurately describes the parallel_case
synthesis pragma, but this book does not recommend the use of unique0 because it
was not supported by most synthesis compilers at the time this book was written.)
7.5 Summary
This chapter has examined best coding practices for representing combinational logic
in RTL models. Proper simulation behavior and synthesis Quality of Results (QoR)
have been considered. The simple definition of combinational logic is that the output
values are always representing a combination of the input values. If any input changes
value, the output is updated to reflect this change. This chapter presented three
SystemVerilog modeling constructs that can model combinational logic when used
properly: continuous assignments, always procedures, and functions.
This chapter also discussed the implied priority when case statements simulate. When
this priority evaluation is not needed, such as with mutually exclusive case item
values, synthesis compilers will remove the unnecessary priority encoded logic from the
gate-level implementation of case statements. Synthesis compilers will almost always
do this automatically, but there are rare situations where the design engineer needs to
guide the synthesis compiler. The unique and unique0 decision modifiers are the
best practice modeling style for this rare situation. The obsolete parallel_case
synthesis pragma should never be used.
Chapter 8
Modeling Sequential Logic
Abstract: Digital gate-level circuitry can be divided into two broad categories:
combinational logic, discussed in the previous chapter, and sequential logic.
Flip-flops (clock-triggered sequential logic) are discussed in this chapter. Latches
(level-sensitive sequential logic) are discussed in the next chapter.
Sequential logic describes gate-level circuitry where the output reflects a value that
has been stored by an internal state of the gates. Only certain input changes, such as a
clock-edge, cause the storage to change. For D-type flip-flops, a specific edge of the
clock input will change the storage of the flip-flop, but changes to the D input value
do not directly change the storage. Instead, the specific clock edge causes the
internal storage of the flip-flop to be updated to the value of the D input at the time of
the clock edge.
RTL models of sequential logic need to reflect this gate-level behavior, meaning
that the output of a block of logic must store a value over one or more clock cycles,
and only update the stored value for specific input changes, but not all input changes.
At the RTL level, an always or always_ff procedure is used to model this
sequential behavior. This chapter examines:
• Synthesis requirements for RTL sequential logic
• The always_ff sequential logic RTL procedure
• Sequential logic clock-to-Q propagation and setup/hold times
• Using nonblocking assignments to model clock-to-Q propagation effect
• Synchronous and asynchronous resets
• Multiple clocks and clock domain crossing (CDC)
• Using unit delays in sequential logic RTL models
• Modeling Finite State Machines (FSMs)
• Modeling Mealy and Moore FSM architectures
• State decoders, and using unique case for 1-hot decoders
• Modeling memory devices such as RAMs
Flip-flops and registers are used to store information for some period of time. The
terms flip-flop and register are often used synonymously, even though there can be
differences in how they are loaded and reset. A flip-flop is a storage element that
changes the state of its storage on a clock edge. A wide variety of hardware applications
can be built from flip-flops, such as counters, data registers, control registers, shift
registers, and state registers. Registers can be built from any type of data storage
device, including flip-flops, latches and RAMs. Most hardware registers are built
from flip-flops.
RTL models of clocked sequential logic flip-flops and registers are written with
an always or always_ff procedure with a sensitivity list that uses a clock edge to
trigger evaluation of the procedure. An example of an RTL flip-flop is:
always_ff @(posedge clk)
  q <= d;
In general, RTL models are written to trigger flip-flops on the positive edge of a
clock input. All ASIC and FPGA devices support flip-flops that trigger on a rising
edge of the clock (the positive edge). Some ASIC or FPGA devices also support
flip-flops that trigger on a falling edge of a clock. Flip-flops, and registers made from
flip-flops, can either be non-resettable or resettable. The reset can be synchronous or
asynchronous to the clock trigger. Some flip-flops also have an asynchronous set input.
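The three flip-flop variations just described can be sketched as follows. The module name, the signal names, and the choice of an active-low reset named rstN are assumptions made for illustration.

```systemverilog
module dff_reset_styles (                // hypothetical module name
  input  logic clk,
  input  logic rstN,                     // active-low reset (assumed polarity)
  input  logic d,
  output logic q_none, q_sync, q_async
);
  // Non-resettable flip-flop
  always_ff @(posedge clk)
    q_none <= d;

  // Synchronous reset: the reset is only sampled on the clock edge
  always_ff @(posedge clk)
    if (!rstN) q_sync <= '0;
    else       q_sync <= d;

  // Asynchronous reset: the reset edge is in the sensitivity list,
  // so the flip-flop resets without waiting for a clock edge
  always_ff @(posedge clk or negedge rstN)
    if (!rstN) q_async <= '0;
    else       q_async <= d;
endmodule: dff_reset_styles
```

The only difference between the last two procedures is the sensitivity list, which is what tells a synthesis compiler which type of reset to infer.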
At the gate-level of design, there are several types of flip-flops, such as: SR, D, JK
and T flip-flops. RTL models can abstract from this implementation detail, and be
written as generic flip-flops. In RTL modeling, the focus is on design functionality,
not design implementation. It is the role of a synthesis compiler to map the abstract
RTL functional description to a specific gate-level implementation. Most ASIC and
FPGA devices use D-type flip-flops, so this book assumes that this will be the type of
flip-flop synthesis compilers will infer from an RTL flip-flop.
• The procedure should execute in zero simulation time. Synthesis compilers ignore
# delays, and do not permit @ or wait time controls. An exception to this rule is the
use of intra-assignment unit delays (see section 8.1.7.1, page 297).
• A variable assigned a value in a sequential logic procedure cannot be assigned a
value by any other procedure or continuous assignment (multiple assignments
within the same procedure are permitted).
• A variable assigned a value in a sequential logic procedure cannot have a mix of
blocking and nonblocking assignments. For example, the reset branch cannot be
modeled with a blocking assignment and the clocked branch modeled with a
nonblocking assignment.
• Any other procedure, continuous assignment or input port assigns to the same
variables as the always_ff procedure.
The IEEE standard also suggests, but does not require, that software tools check for
other synthesis restrictions, such as an incorrect sensitivity list. Design engineering
tools such as synthesis compilers and lint checkers (that check coding style) perform
these optional checks, but most simulators do not perform additional checking on
always_ff procedures. These errors and optional additional checking help to ensure
that RTL models with sequential logic will both simulate correctly and synthesize
correctly.
The always_ff procedure must be followed by a sensitivity list that meets synthesis
requirements. The sensitivity list cannot be inferred from the body of the procedure
in the way always_comb can infer a sensitivity list. The reason is simple. The
clock signal is not named within the body of the always_ff procedure. The clock
name, and which edge of the clock triggers the procedure, must be explicitly specified
by the design engineer in the sensitivity list.
At the implementation level in ASICs and FPGAs, clocked sequential logic has
characteristics that are distinct from combinational logic. One of these characteristics
is the propagation delay from when the clock input triggers to when the flip-flop
output changes. This is often referred to as the clock-to-Q delay. A second characteristic
is the setup and hold time. The setup time is the period of time in which the data input
must be stable before a clock trigger. The hold time is the period of time in which data
must remain stable after the clock trigger. If the data changes within the setup and
hold period, the value stored as the new flip-flop state will be uncertain. Under these
conditions, it is also possible for a flip-flop’s state to oscillate between values for a
period of time before settling to a stable value. This unstable period is referred to as
metastability.
Abstract RTL models should be zero-delay models (a requirement for best synthesis
Quality of Results, or QoR), which means the RTL models do not have propagation
delays. The output of a flip-flop changes at the same moment of simulation
time in which the clock trigger occurs, without the gate-level clock-to-Q propagation
delay. As zero-delay models, abstract RTL flip-flops also do not have setup and hold
times, and cannot go metastable. Nevertheless, the behavior of clock-to-Q propagation
must be represented in abstract RTL models, and the RTL models need to reflect
proper design techniques to avoid metastable conditions once implemented in an
ASIC or FPGA.
An example output from this 4-bit Johnson counter after reset is:
cnt[0:3] = 0000
cnt[0:3] = 1000
cnt[0:3] = 1100
cnt[0:3] = 1110
cnt[0:3] = 1111
cnt[0:3] = 0111
cnt[0:3] = 0011
cnt[0:3] = 0001
cnt[0:3] = 0000
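A counter that behaves this way can be sketched as follows. This is a minimal illustration rather than the original listing: the module names jcounter and jcounter_tb, the 10-unit clock period, and the reset sequencing in the testbench are assumptions, while cnt, clk and rstN follow the signal names used in this section.

```systemverilog
// Sketch of a 4-bit Johnson counter (module name is an assumption)
module jcounter
(output logic [0:3] cnt,
 input  logic       clk, rstN
);
  always_ff @(posedge clk)
    if (!rstN) cnt <= '0;      // synchronous active-low reset
    else begin                 // shift the count
      cnt[0] <= ~cnt[3];       // inverted last stage feeds the first stage
      cnt[1] <= cnt[0];
      cnt[2] <= cnt[1];
      cnt[3] <= cnt[2];
    end
endmodule: jcounter

// Hypothetical testbench that prints cnt once per clock cycle
module jcounter_tb;
  logic       clk = 0;
  logic       rstN;
  logic [0:3] cnt;

  jcounter dut (.cnt, .clk, .rstN);

  always #5 clk = ~clk;        // assumed clock period of 10 time units

  initial begin
    rstN = 0;                  // hold reset across one rising clock edge
    repeat (2) @(negedge clk);
    rstN = 1;                  // release reset away from the rising edge
    repeat (9) begin
      $display("cnt[0:3] = %b", cnt);
      @(negedge clk);          // sample mid-cycle to avoid races with the DUT
    end
    $finish;
  end
endmodule: jcounter_tb
```

The testbench changes rstN and samples cnt on the falling clock edge, so the stimulus never competes with the rising-edge flip-flops inside the counter.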
The cascade effect from one flip-flop to the next is readily apparent in this output.
The 0 output from the last flip-flop, DFF4, is inverted, and becomes a 1 on the D input
of the first flip-flop, DFF1. On the first clock cycle, this 1 is stored into DFF1, while
the old state of DFF1, a 0, is cascaded into DFF2. On the second clock cycle, the 1 on
the output of DFF1 is cascaded into DFF2. On the third clock cycle, the 1 in DFF2
cascades to DFF3, and on the fourth clock cycle the 1 in DFF3 cascades into DFF4. After
that fourth clock, the DFF4 output becomes 1, and the D input to DFF1 becomes 0. On
the next clock cycle, that 0 loads into DFF1, and that 0 cascades through the four
flip-flops on each subsequent clock cycle.
The Johnson counter design depends on the clock-to-Q propagation delay of each
flip-flop, which allows the previous state of each flip-flop in the series to be a stable D
input for each subsequent stage in the series of flip-flops. It is critical that RTL
models maintain this clock-to-Q propagation delay behavior, even though the RTL code is
modeled with zero delays. This all-important characteristic of flip-flop behavior is
represented by the nonblocking assignment token ( <= ).
Within a delta cycle:
• Execute programming statements and operators
• Continuous assignments:
  Evaluate right-hand side and update left-hand side
• Blocking assignments:
  Evaluate right-hand side and update left-hand side
• Nonblocking assignments:
  STEP 1 - Evaluate right-hand side
Figure 8-3: Synthesis result for Example 8-1: Nonblocking assignments, J-Counter
    always_ff @(posedge clk)
      if (!rstN) cnt <= '0;  // synchronous active-low reset
      else begin             // shift the count
        cnt[1] <= cnt[0];
        cnt[3] <= cnt[2];
        cnt[0] <= ~cnt[3];
        cnt[2] <= cnt[1];
      end
    always_ff @(posedge clk)
      if (!rstN) cnt[0] <= '0;      // reset
      else       cnt[0] <= ~cnt[3]; // store
    always_ff @(posedge clk)
      if (!rstN) cnt[2] <= '0;      // reset
      else       cnt[2] <= cnt[1];  // store
    always_ff @(posedge clk)
      if (!rstN) cnt[1] <= '0;      // reset
      else       cnt[1] <= cnt[0];  // store
    always_ff @(posedge clk)
      if (!rstN) cnt[3] <= '0;      // reset
      else       cnt[3] <= cnt[2];  // store
A simulator can schedule the assignments from the four concurrent procedures in
any order within the Active and NBA Update regions. The order does not matter,
because of the two-step execution process of nonblocking assignments.
Example 8-2: 4-bit Johnson counter incorrectly modeled with blocking assignments
    module jcounter_bad
    (output logic [0:3] cnt,
     input  logic       clk, rstN
    );
      always_ff @(posedge clk)
        if (!rstN) cnt <= '0;  // synchronous active-low reset
        else begin             // shift the count
          cnt[0] = ~cnt[3];
          cnt[1] = cnt[0];
          cnt[2] = cnt[1];
          cnt[3] = cnt[2];
        end
    endmodule: jcounter_bad
The simulation results from this example, after the counter is reset, are:
cnt[0:3] = 0000
cnt[0:3] = 1111
cnt[0:3] = 0000
cnt[0:3] = 1111
cnt[0:3] = 0000
The blocking assignments behave as combinational logic, and do not have the
clock-to-Q propagation delta required to model the cascade effect of flip-flops
connected in series. Synthesis will implement this example by collapsing the four
assignments into a single flip-flop, with the single output going to all four bits of the cnt
signal.
Figure 8-4 shows the results of synthesizing Example 8-2.
Figure 8-4: Synthesis result for Example 8-2: Blocking assignments, bad J-Counter
Synthesis compilers might not give any warning or error messages because of the
incorrect usage of blocking assignments to represent sequential logic flip-flops. The
synthesis compiler simply creates an implementation that matches the way the RTL
model simulates. Design engineers need to understand how blocking and nonblocking
assignments behave, and use them correctly. Running lint check programs on RTL
models, which check code for proper modeling style, will report warnings when
blocking assignments are used in procedures that are triggered on a clock edge.
The next example violates this guideline in order to illustrate the difference in how
blocking and nonblocking assignments behave.
Using blocking assignments to represent sequential logic will almost always result
in simulation race conditions and risk synthesis generating hardware implementations
that do not have the intended design functionality. It would seem reasonable for the
SystemVerilog language to make nonblocking assignments in an always_ff or
always_latch procedure a syntax requirement. If blocking assignments were illegal
in these procedures, it would prevent engineers from making coding errors that might
look functionally correct in simulation, but do not synthesize as expected or desired.
There is a reason blocking assignments are allowed, however. While not a recommended
best-practice coding style, it is possible to mix nonblocking and blocking
assignments in the same procedure. When used correctly this style will simulate
correctly and synthesize into a correct implementation.
A mix of blocking and nonblocking assignments is required in order to code this
behavior as a single always procedure.
    always_ff @(posedge clk) begin: two_steps
      logic [7:0] tmp;   // local temporary variable
      tmp = a + b;       // calculate tmp immediately
      out <= c - tmp;    // store final result in a register
    end: two_steps
then tmp would not be updated until the NBA Update region. The subtraction
operation in the next line of code would always use the previous state of tmp. This
simulates and synthesizes as a pipeline, as in Figure 8-6.
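The difference between the two assignment styles for tmp can be sketched side by side. In this illustration only a, b, c and tmp come from the example; the module name, the 8-bit widths, and the output names out_same_cycle and out_pipelined are assumptions.

```systemverilog
// Sketch comparing a blocking versus nonblocking temporary variable
module two_steps_compare
(input  logic       clk,
 input  logic [7:0] a, b, c,
 output logic [7:0] out_same_cycle,  // uses the tmp calculated on this clock edge
 output logic [7:0] out_pipelined    // uses the tmp stored on the previous clock edge
);
  // Blocking assignment to tmp: the subtraction sees the new sum immediately,
  // so only the final result is a register.
  always_ff @(posedge clk) begin: blocking_tmp
    logic [7:0] tmp;
    tmp = a + b;
    out_same_cycle <= c - tmp;
  end: blocking_tmp

  // Nonblocking assignment to tmp: tmp is updated in the NBA Update region,
  // so the subtraction reads the previous tmp and tmp becomes a pipeline stage.
  always_ff @(posedge clk) begin: nonblocking_tmp
    logic [7:0] tmp;
    tmp <= a + b;
    out_pipelined <= c - tmp;
  end: nonblocking_tmp
endmodule: two_steps_compare
```

Each tmp is local to its own named block, so neither temporary can be read from another procedure.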
NOTE
A simulation race condition could occur if the value of a temporary variable that is
assigned by a blocking assignment in a sequential logic block is read from outside of
the sequential block. A race condition occurs whenever a variable is assigned by a
blocking assignment on a clock edge, and a concurrent sequential always procedure
reads the value on the same clock edge. The change to the variable value and the
reading of the variable are both active events, which could be scheduled in any order by a
simulator. Therefore, the procedure that reads the value of the variable might see the
value before it is changed or after it is changed on that clock edge.
Declare temporary variables that are used in a sequential logic block as local
variables within the block.
A local variable declared within a procedural block cannot be read from outside of
the procedure. This will prevent potential simulation race conditions.
The preceding examples follow this guideline. The temporary variable is declared
as a local variable within the procedure (see Chapter 6, section 6.1.2, page 215).
At the implementation level, actual flip-flops can be non-resettable (no reset input),
or can have a reset input control. The reset control can be synchronous or
asynchronous to the clock, and can be an active-high or active-low control. Some flip-flop
devices also have a set (sometimes called a preset) input. There are advantages and
disadvantages to each of these types of flip-flops. These engineering trade-offs are
outside the scope of this book, which focuses on the RTL modeling styles that reflect
these implementation characteristics.
NOTE
Specific target ASIC or FPGA devices might only support one type of reset.
ASICs and FPGAs can differ in which type of reset the device uses, which might
impact RTL modeling style. FPGA devices, in particular, often only have flip-flops
with one type of reset (perhaps synchronous, active-high). In contrast, many ASIC
devices, and some FPGA devices, have both synchronous and asynchronous
flip-flops available. Likewise, some devices only have flip-flops with a reset input,
whereas other devices also have flip-flops with both set and reset inputs.
Write RTL models using a preferred type of reset, and let synthesis compilers
map the reset functionality to the type of reset supported by the target
ASIC or FPGA. Only write RTL models to use the same type of reset that a
specific target ASIC or FPGA uses if it is necessary in order to achieve the
most optimal speed and area for that specific device.
Many RTL design engineers, including the author of this book, model with a preferred
style of reset without concern for what the target device supports. Synthesis
compilers can map any type of reset in an RTL model to any type of reset available in
a target ASIC or FPGA device. For example, if the RTL model uses active-low
resets and the target device only has flip-flops with active-high resets, then a synthesis
compiler will add extra gate-level logic to invert the reset used in the RTL model. If the
RTL model uses synchronous resets, and the target device only has flip-flops with
asynchronous resets, then a synthesis compiler will add extra gate-level logic external
to the asynchronous flip-flop to reset it synchronously to the clock. Modern ASICs and
FPGAs have ample speed and capacity. A fully functional design can be obtained
without having to worry about whether the synthesis process had to add some extra
logic to map the RTL style reset to the target device’s type of flip-flop.
Most of the examples in this book are modeled with active-low asynchronous
resets, though some of the smaller code snippets use active-low synchronous resets.
Figure 8-7: Synthesis result: Async reset DFF mapped to Xilinx Virtex®-6 FPGA
Figure 8-8 shows the results of targeting the same generic flip-flop to a device that
does not have synchronous reset flip-flops, a Xilinx CoolRunner™-II CPLD. The
flip-flop’s asynchronous active-high clr and pre inputs are not used.
Figure 8-8: Synthesis result: Async reset mapped to Xilinx CoolRunner™-II CPLD
Both simulation and synthesis require that only the leading edge of an asynchronous
reset be included in the sensitivity list. Simulation requires this for proper
asynchronous reset behavior, and synthesis requires it syntactically. The SystemVerilog
syntax does not enforce this restriction. It is an RTL coding style that design
engineers must follow. Lint checkers can check that this coding style is being followed.
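As a sketch of this coding style, an asynchronous active-low reset flip-flop lists only the leading (falling) edge of the reset in its sensitivity list, and then tests the reset level inside the procedure. The module name and the 8-bit data width are assumptions made for illustration.

```systemverilog
// Sketch of a register with an asynchronous active-low reset
module dff_async_rstN
(input  logic       clk,
 input  logic       rstN,   // active-low reset; only its leading edge is a trigger
 input  logic [7:0] d,
 output logic [7:0] q
);
  always_ff @(posedge clk or negedge rstN)
    if (!rstN) q <= '0;     // reset takes effect without waiting for a clock edge
    else       q <= d;      // otherwise store d on the rising clock edge
endmodule: dff_async_rstN
```

Because the trailing (rising) edge of rstN is not in the sensitivity list, releasing the reset does not by itself load d into the register.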
Although either active-high or active-low resets can be used, all RTL models in a
project should be consistent. A mix of reset polarity can lead to code that is difficult to
understand, maintain and reuse.
This book uses the convention of adding a capital “N” to the end of the names of
active-low signals. Another common convention is to append an “_n” to active-low
signal names.
Some ASIC and FPGA devices have pre-defined and optimized chip-enable flip-
flops. Synthesis compilers can translate the RTL model of a chip-enable flip-flop to
these pre-defined components, if available.
If the target ASIC or FPGA does not have a chip-enable flip-flop, then synthesis
can implement the chip-enable behavior by adding functionality outside of a flip-flop.
This might be done by adding a multiplexor on the data input that selects either the
new data value or the output of the flip-flop, as illustrated in Figure 8-11.
[Figure 8-11 signal labels: d, enable, clk, rstN, q]
Observe that the clock to the flip-flop is always present. A chip-enable flip-flop
gates the data input to the flip-flop, not the clock input.
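A chip-enable flip-flop of this kind might be modeled in RTL as sketched here. The signal names d, enable, clk, rstN and q follow the figure labels; the module name and the choice of an asynchronous active-low reset are assumptions.

```systemverilog
// Sketch of a chip-enable flip-flop (reset style is an assumption)
module dff_enable
(input  logic clk, rstN,
 input  logic enable,       // chip enable: gates the data, not the clock
 input  logic d,
 output logic q
);
  always_ff @(posedge clk or negedge rstN)
    if (!rstN)       q <= '0;  // asynchronous active-low reset
    else if (enable) q <= d;   // load new data only when enabled
    // no final else: q keeps its previous state when enable is 0
endmodule: dff_enable
```

The missing final else branch is what expresses the hold behavior. If the target device has no chip-enable flip-flop, this is the path that synthesis can implement as a multiplexor feeding q back to the data input.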
NOTE
The priority given to the set or the reset input in an RTL model should match
a specific target ASIC or FPGA device. Some devices give priority to the
reset input, whereas other devices give priority to the set input.
If a set/reset flip-flop behavior is required, write the RTL model priority for set
versus reset to match the priority of the specific target device in which the design will be
implemented.
Since not all target devices have the same set/reset priority, it is difficult to write
set/reset flip-flop RTL models that will synthesize optimally for all target devices. If
the target device does not have a set/reset flip-flop with the same priority as the RTL
model, synthesis compilers can add additional logic outside of the flip-flop to make it
match the RTL model. This additional logic can affect device timing, however, and
cause race conditions with other parts of a design when coming out of a reset state.
Set/reset flip-flops also have more exacting setup and hold times for these inputs
than a flip-flop that only has a set or reset input, but not both. Even when the priority
of the set and reset in the RTL model matches the priority of the target device,
designers need to be careful that the design can meet these setup and hold requirements.
Simulation glitch with set/reset flip-flops. The example shown above of a set-reset
flip-flop RTL model is functionally correct, and will synthesize correctly. There is,
however, a potential simulation glitch with this code. The glitch occurs if the setN and
rstN control inputs are both active at the same time, and then rstN becomes inactive.
In this example, rstN takes priority, and the flip-flop properly resets. When the
rstN input becomes inactive, the setN input should take over, and the flip-flop
should switch to its set state. The glitch in simulation is that the RTL model is only
sensitive to the leading edge of rstN (a requirement of synthesis compilers) and
is not sensitive to the trailing edge of rstN. When rstN becomes inactive, the
sensitivity list will not trigger, and the procedure will therefore miss setting the
flip-flop, even though setN is still active. This glitch will only last until the next
positive edge of clock, which will trigger the always procedure, and cause the
procedure to be reevaluated.
The solution to prevent this simulation glitch is to add the trailing edge of reset to
the sensitivity list. However, just adding the trailing edge of reset would lead to the
same problem described earlier, where the trailing edge of reset could act as a clock
(see section 8.1.5.3, page 288). Therefore, the inactive level of the reset input needs to
be ANDed with the active level of the set input, and the sensitivity list will trigger
when that result becomes true.
The revised sensitivity list to prevent a simulation glitch is:
    always_ff @( posedge clk
                 or negedge rstN
                 or negedge setN
                 or posedge (rstN & ~setN)  // not synthesizable
               )
      if (!rstN)      q <= '0;  // reset (active low)
      else if (!setN) q <= '1;  // set (active low)
      else            q <= d;   // clock
Triggering on the result of an expression is legal in the SystemVerilog language, but
is not permitted by synthesis compilers. The additional trigger to avoid the simulation
glitch needs to be hidden from synthesis. This can be done using synthesis
translate_off / translate_on pragmas. Synthesis pragmas are special comments
that begin with the word synthesis. Simulators ignore these comments, but
synthesis compilers act on them. The following snippet adds the translate_off /
translate_on pragmas.
    always_ff @( posedge clk
                 or negedge rstN
                 or negedge setN
                 // synthesis translate_off
                 or posedge (rstN & ~setN)  // not synthesizable
                 // synthesis translate_on
               )
      if (!rstN)      q <= '0;  // reset (active low)
      else if (!setN) q <= '1;  // set (active low)
      else            q <= d;   // clock
Figure 8-12 shows the results from synthesizing this set-reset flip-flop code.
Observe that the synthesis compiler added additional logic before the set input of
the generic flip-flop in order to enforce that rstN has priority over setN. If the target
ASIC or FPGA device has set/reset flip-flops where reset has priority over set, then
this additional logic will be removed when the generic flip-flop is mapped to that
target ASIC or FPGA. Otherwise, the additional logic will be left in so that the ASIC or
FPGA implementation matches the RTL model.
The generic flip-flop that the synthesis compiler used before targeting a specific
ASIC or FPGA has active-high set and reset inputs. Therefore, the synthesis compiler
added inverters to the active-low signals in the RTL model. These inverters will be
removed if the target device has active-low control inputs.
An alternative to using the translate_off and translate_on synthesis pragmas
is to use conditional compilation. Most synthesis compilers have a predefined
SYNTHESIS macro that can be used to conditionally include or exclude code that the
synthesis tool compiles. To exclude the non-synthesizable line in the previous
example, the code would be:
    `ifndef SYNTHESIS                // compile if not a synthesis compiler
      or posedge (rstN & ~setN)      // not synthesizable
    `endif                           // end of synthesis exclusion
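Placed in context, the conditional compilation approach might look like the following sketch. The module name dff_set_reset and the 1-bit data width are assumptions; the procedure body and signal names follow the preceding set/reset example.

```systemverilog
// Sketch of the set/reset flip-flop using conditional compilation
module dff_set_reset
(input  logic clk, rstN, setN,
 input  logic d,
 output logic q
);
  always_ff @( posedge clk
               or negedge rstN
               or negedge setN
`ifndef SYNTHESIS                     // simulators compile this extra trigger
               or posedge (rstN & ~setN)
`endif                                // tools that define SYNTHESIS skip it
             )
    if (!rstN)      q <= '0;  // reset (active low) has priority
    else if (!setN) q <= '1;  // set (active low)
    else            q <= d;   // clock
endmodule: dff_set_reset
```

This relies on the synthesis compiler defining the SYNTHESIS macro; where that is not the case, the macro would need to be defined on the tool's command line or in a setup script.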
NOTE
At the time this book was written, one commercial synthesis compiler did
not recognize // synthesis as a synthesis pragma. That compiler required
that pragmas start with // pragma or // synopsys.
NOTE
Using initial variable values to model flip-flop power-up is specific to FPGA
devices. ASIC devices do not support this capability.
Synthesis compilers might require special options to enable the use of initial
values in variable declarations. Refer to the documentation of the specific
compiler.
Synthesis compilers, timing analyzers and Clock Domain Crossing (CDC) analysis
tools are more effective when all sequential logic in a module uses the same clock.
When data moves from one clock domain to another, care must be taken to avoid
metastability problems. A metastable condition can occur when the data input to a
flip-flop changes too close to the clock trigger. The flip-flop’s setup time is the
amount of time before the clock in which the inputs must be stable. The hold time is
the amount of time after the clock edge in which the inputs must remain stable.
Setup and hold violations are most likely to occur on signals that cross clock
domains, as the data transfers from the output of one module to an input of another
module that uses a different clock. In multiple clock designs, these modules could be,
and often are, running at different frequencies. There is a risk that inputs to a module
that originated in a different clock domain could change too close to a clock edge in
the current module’s clock domain, resulting in a metastable condition. To avoid the
risk of metastability, synchronizer circuits need to be added to any input ports where
that input originated in a different clock domain.
One common way to pass data vectors from one clock domain to another domain is
to use request and acknowledge handshake control signals. The module sending out
the data issues a request to the receiving module, requesting that the receiving module
read the data bus. The request signal originates in the clock domain of the sending
module, and could potentially arrive close to a clock edge of the receiving module’s
clock domain. To avoid the risk of metastability, the receiving module passes the
incoming request through a clock synchronizer, before registering the incoming data.
After the receiving module has registered the data, it sends out an acknowledgement
to the sending module, which the sending module synchronizes to its clock domain.
The sending module holds the data stable until the acknowledge handshake is
received and synchronized.
Single-bit CDC synchronizers are most often implemented with a two-stage shift
register. Figure 8-13 shows a typical synchronizer circuit.
Figure 8-13: Two flip-flop clock synchronizer for 1-bit control signals
An RTL flip-flop is modeled with zero-delays, and has no setup and hold times.
Signals that cross clock domains will appear to always work in RTL models, even
when there is no CDC synchronization. Nonetheless, the RTL models should include
the synchronizer circuits that will be needed in the gate-level implementation of the
RTL model. An example of a 1-bit control line synchronizer is:
    always_ff @(posedge clk or negedge rstN)
      if (!rstN) begin            // asynchronous active-low reset
        req_tmp    <= '0;
        req_synced <= '0;
      end
      else begin
        req_tmp    <= req;        // register req input
        req_synced <= req_tmp;    // stabilize req
      end
ASIC and FPGA devices might have optimized clock synchronizers in their target
libraries. Synthesis compilers will recognize the RTL clock synchronizer behavior
and map this behavior to an appropriate target component, if available.
The proper design of clock domain crossing synchronizers is an engineering topic,
and is outside the scope of this book. Appendix D lists some additional resources on
the topic of clock-domain crossing, metastability, and synchronization.
NOTE
Synthesis compilers ignore intra-assignment delays. This can lead to a mismatch
between the RTL simulation behavior and the actual gate-level implementation,
especially if the intra-assignment delay, or a series of intra-assignment
delays, is longer than a clock cycle.
A unit delay is a delay of one unit of time ( #1 ), using the module’s timeunit
definition (or a `timescale compiler directive, if there is no local timeunit definition
in the module, or the simulation time units if there is no `timescale in effect).
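A flip-flop with an intra-assignment unit delay can be sketched as follows. The module name and the 1ns/1ps time units are assumptions chosen for illustration; with these settings the #1 represents a 1 nanosecond delay in simulation only.

```systemverilog
// Sketch of a flip-flop with an intra-assignment unit delay
module dff_unit_delay
(input  logic clk,
 input  logic d,
 output logic q
);
  timeunit 1ns;            // local time unit, so #1 means 1 nanosecond (assumed)
  timeprecision 1ps;

  always_ff @(posedge clk)
    q <= #1 d;             // d is sampled at the clock edge; q updates 1 unit later
endmodule: dff_unit_delay
```

The right-hand side is still evaluated at the clock edge, so the procedure itself does not block. Only the update of q is postponed, and synthesis compilers ignore the delay.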
This book does not discuss Finite State Machines (FSM) design theory. It is
assumed the reader is already familiar with state machine design. The focus of this
book is on best-practice coding styles for RTL models of FSMs.
Finite State Machines spread a series of operations over multiple clock cycles, often
with decision branches as to which operations are to be executed. A common usage of
state machines is to set various control signals for different conditions as data is being
processed.
Model FSMs in a separate module. (Support logic for the FSM, such as a
counter that is only used by the FSM, can be included in the same module.)
Keeping the FSM code separate from other design logic helps make the FSM easier
to maintain and to reuse in multiple projects. Synthesis compilers and design
automation tools that aid in FSM design also work better when the FSM code is in a separate
module, and not mingled with other functionality.
The examples in this section represent the flow of a simplified 8-bit Serial-to-Parallel
Interface (SPI). In order to focus on the state machine logic, this simplified SPI
design does not have any control registers, and is not configurable the way a more
complex SPI might be.
The simple SPI has a 1-bit serial_in input that is loaded into an 8-bit register
over 8 clock cycles. On the first clock cycle the serial_in value is loaded into the
most-significant bit (MSB) of the register. On the next cycle, the MSB of the register
is shifted down one bit, and the next serial_in value is loaded into the register’s
MSB. This shift-and-load operation occurs 8 times over 8 clock cycles to load the
serial input stream into the 8-bit parallel register.
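The shift-and-load operation can be sketched as a small shift register. The module name spi_shift_reg and the output name data are assumptions; shift_en is the enable control that the state machine described later in this section provides.

```systemverilog
// Sketch of the 8-bit serial-to-parallel shift register
module spi_shift_reg
(input  logic       clk,
 input  logic       shift_en,   // 1 enables the shift-and-load operation
 input  logic       serial_in,  // 1-bit serial input
 output logic [7:0] data        // 8-bit parallel register (name assumed)
);
  always_ff @(posedge clk)
    if (shift_en)
      data <= {serial_in, data[7:1]};  // shift down 1 bit, load new bit into the MSB
endmodule: spi_shift_reg
```

After eight enabled clock cycles, every bit of data has been replaced by a bit from the serial stream.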
The serial_in input stream includes a start-bit to indicate when to start loading
the 8-bit register. The serial_in input is held at 1 when no transfer is occurring.
The first time serial_in is 0 indicates that an 8-bit stream of data will follow. Thus,
the serial_in pattern is 9 bits; a start bit followed by 8 bits of data. Figure 8-14
illustrates the pattern for an 8-bit data value.
Figure 8-14: An 8-bit serial value of hex CA, plus a start bit
serial_in -->  1 1 1  0  1 1 0 0 1 0 1 0  1 1 1
                      |  \_____________/
              start bit       data
The simple SPI uses three states to load an 8-bit serial input stream: WAITE, LOAD
and READY. (The state name WAITE is purposely spelled with an “E” at the end to
differentiate it from the SystemVerilog wait keyword. This is not necessary
syntactically, because SystemVerilog is a case sensitive language. However, some
engineering tools can be invoked in a case-insensitive mode, and would not see a
difference between wait and WAIT. Synthesis compilers can also generate design
netlists in case-insensitive languages, such as VHDL or EDIF.)
The state flow for this simple SPI state machine is shown in Figure 8-15:
Figure 8-15: State flow for an 8-bit serial-to-parallel Finite State Machine
State    cntr_rstN   shift_en   data_rdy
WAITE        0           0          0
LOAD         1           1          0
READY        0           0          1
The FSM resets to a RESET state. On the next clock cycle after reset is deasserted,
the state machine transitions to a WAITE state, and then remains in the WAITE state
until the serial_in input goes to zero, which represents the start bit. The state
machine then transitions to a LOAD state, and remains in that state for 8 clock cycles.
A 3-bit decrement counter is used to control how long the FSM stays in the LOAD
state. When the counter reaches a count of 0, the state machine transitions to a READY
state. On the next clock cycle, the FSM then transitions back to the WAITE state,
where it remains until the next start bit is detected.
The FSM in this simple SPI sets three control signals as the FSM outputs:
• cntr_rstN is used to hold the 3-bit decrement counter in a reset state (a full count
of 7).
• shift_en is used to enable an 8-bit shift register. When shift_en is 1, the values
in the register are shifted down 1 bit, and a new value is loaded into the
most-significant bit of the register.
• data_rdy is set to 1 when the 8-bit parallel register has been loaded.
The table shown in Figure 8-15 shows the values for the control signals in each
state of the FSM.
The full code and resulting synthesis schematic for this simple SPI state machine are
shown later in this chapter, in section 8.2.4 (page 309), after various aspects of Finite
State Machine modeling have been examined.
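Ahead of that full listing, the state flow and control signals described above can be sketched as follows. This is only an outline under stated assumptions, not the listing from section 8.2.4: the module name, the signal names next and count, the asynchronous reset on the state register, and the synchronous hold of the counter at 7 are all choices made for this illustration.

```systemverilog
// Sketch of the simple SPI state machine (not the listing from section 8.2.4)
module simple_spi_fsm
(input  logic clk, rstN,    // clock and active-low reset (reset style assumed)
 input  logic serial_in,    // serial input; held at 1 when idle
 output logic cntr_rstN,    // 0 holds the decrement counter at a full count of 7
 output logic shift_en,     // 1 enables the 8-bit shift register
 output logic data_rdy      // 1 when the parallel register has been loaded
);
  typedef enum logic [1:0] {RESET, WAITE, LOAD, READY} states_t;
  states_t    state, next;
  logic [2:0] count;        // support counter used only by the FSM

  // State register
  always_ff @(posedge clk or negedge rstN)
    if (!rstN) state <= RESET;
    else       state <= next;

  // Next-state decoding
  always_comb begin
    next = state;                                // default: stay in the current state
    case (state)
      RESET: next = WAITE;
      WAITE: if (serial_in == 1'b0) next = LOAD; // start bit detected
      LOAD : if (count == 3'd0)     next = READY;
      READY: next = WAITE;
    endcase
  end

  // 3-bit decrement counter; gives 8 clock cycles in the LOAD state
  always_ff @(posedge clk)
    if (!cntr_rstN) count <= 3'd7;
    else            count <= count - 3'd1;

  // Output decoding based only on the current state (Figure 8-15 values)
  always_comb begin
    {cntr_rstN, shift_en, data_rdy} = 3'b000;
    case (state)
      LOAD : begin cntr_rstN = '1; shift_en = '1; end
      READY: data_rdy = '1;
      default: ;                                 // RESET and WAITE keep the defaults
    endcase
  end
endmodule: simple_spi_fsm
```

The counter sits at 7 whenever the FSM is outside the LOAD state, then counts 7 down to 0 while in LOAD, which keeps shift_en at 1 for eight clock cycles.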
There are two primary architectures used for most ASIC and FPGA state machine
designs: Mealy and Moore (named after George H. Mealy and Edward F. Moore,
respectively). The primary difference in these architectures is when outputs from the
state machine can change relative to changes in the state of the FSM. With a Moore
architecture, output values are based solely on the current state of the FSM. Thus, the
outputs can only change values when the state changes. In a Mealy architecture, the
output values are based on a combination of the current state and other inputs to the
state machine. Thus, the outputs can change asynchronously to when the state changes.
At the abstract RTL modeling level, Mealy and Moore architectures are represented
by the decision statements that set the FSM outputs. If only the state variable is used
to determine the output values, then the behavior represents a Moore architecture. An
example of this output decoding is:
    always_comb begin
      case (state)
        RESET: begin
                 cntr_rstN = '0; shift_en = '0; data_rdy = '0;
               end
        WAITE: begin ... end
        LOAD : begin ... end
        READY: begin
                 cntr_rstN = '0; shift_en = '0; data_rdy = '1;
               end
      endcase
    end
If the state variable and other signals are used to determine the outputs, then the
state machine behaves as a Mealy architecture.
The following example represents a Mealy architecture because it decodes the
value of state and, when in the READY state, also decodes a data_valid signal to
set the output controls.
    always_comb begin
      case (state)
        RESET: begin
                 cntr_rstN = '0; shift_en = '0; data_rdy = '0;
               end
        WAITE: begin ... end
        LOAD : begin ... end
        READY: begin
                 cntr_rstN = '0; shift_en = '0;
                 if (data_valid) data_rdy = '1;
                 else            data_rdy = '0;
               end
      endcase
    end
The states in a Finite State Machine are represented by encoded values. There are
many different codes that can be used, such as: binary count, one-hot, one-hot-0, Gray
code (named after Frank Gray), and Johnson count (named after Dr. Robert Royce
Johnson). The advantages of each encoding style, and when each is appropriate, are
design engineering topics beyond the scope of this book. Once that choice has
been made, however, it can be reflected in the RTL model of the state machine.
Enumerated type labels can be defined with specific values to represent the encoding
values. Several examples follow:
// default binary encoding (RESET=0, WAITE=1, LOAD=2, ...)
typedef enum logic [1:0] {RESET,WAITE,LOAD,READY} states_t;
// one-hot encoding
typedef enum logic [3:0] {RESET = 4'b0001,
                          WAITE = 4'b0010,
                          LOAD  = 4'b0100,
                          READY = 4'b1000
                         } states_onehot_t;
// one-hot-0 encoding
typedef enum logic [2:0] {RESET = 3'b000,
                          WAITE = 3'b001,
                          LOAD  = 3'b010,
                          READY = 3'b100
                         } states_onehot0_t;
Define a logic (4-state) base type and vector size for enumerated variables.
The SystemVerilog enumerated type has a default base data type of int. This is a
32-bit 2-state data type, which can hide design bugs in simulation that would show up
as an X value with a 4-state logic type. Although synthesis compilers will optimize
out any unused bits of the 32-bit default vector size, it is a better coding style to be
explicit for the vector size needed for the encoded values.
Enumerated types versus parameter constants. The legacy Verilog language did
not have enumerated types. Instead, state values were encoded by using parameters
(either the parameter or localparam keyword). For example:
// one-hot encoding
localparam [3:0] RESET = 4'b0001,
                 WAITE = 4'b0010,
                 LOAD  = 4'b0100,
                 READY = 4'b1000;
Use enumerated variables for FSM state variables. Do not use parameters
and loosely typed variables for state variables.
Enumerated variables have strongly typed assignment rules that can prevent common
coding mistakes.
While the appearance and functionality of parameters versus enumerated type
labels are similar, the constructs have very different language rules. When parameters
are used, the state variables will be a simple variable type, such as:
logic [3:0] state, next;
SystemVerilog variables are loosely typed, meaning a value of a different type or
size can be assigned to the variable, and an implicit cast conversion will occur (see
Chapter 5, section 5.15, page 198). This implicit conversion can lead to a number of
programming gotchas when modeling state machines. The gotchas will compile and
simulate, but can have functional bugs. These functional bugs can be subtle. At best,
they impact the design schedule because the bugs need to be detected, debugged,
corrected, and the design reverified. It is possible, however, that a subtle bug can go
undetected, and affect the gate-level implementation of the design.
When enumerated types are defined, the state variables can be declared as that user-
defined type. For example:
states_t state, next;
Enumerated type variables are more strongly typed. The definition of an enumerated
type cannot have size mismatches or duplicate values. Assignments to enumerated
type variables must be one of the defined labels for that enumerated type, or
another enumerated variable from the same enumerated definition. Chapter 4, section
4.4.3 (page 118) discusses enumerated type assignment rules in more detail.
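The difference between the two sets of assignment rules can be sketched as follows. This is a minimal illustration rather than code from the design in this chapter: the module name enum_rules_demo and the raw_state variable are hypothetical, and the two commented-out lines are the kind of assignment a compiler is expected to reject.

```systemverilog
module enum_rules_demo;
  typedef enum logic [1:0] {RESET, WAITE, LOAD, READY} states_t;

  states_t    state, next;   // strongly typed enumerated variables
  logic [1:0] raw_state;     // loosely typed variable (hypothetical)

  initial begin
    next      = LOAD;               // legal: a label from states_t
    state     = next;               // legal: same enumerated type
    raw_state = 2'b11;              // legal: any value can be assigned
    raw_state = state + 2'd1;       // legal: implicit conversion, unchecked
    // state  = 2'b11;              // illegal: not a label of states_t
    // state  = next + 2'd1;        // illegal: result is not a states_t
    state     = states_t'(2'b11);   // legal only with an explicit cast
    $display("state=%s raw_state=%b", state.name(), raw_state);
  end
endmodule
```

The explicit cast on the last assignment is shown only to make the contrast visible; it bypasses the protection that the enumerated type otherwise provides.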
Synthesizing state encoding. Synthesis compilers will recognize the encoding that
is defined in the RTL model, and, by default, use that encoding in the gate-level
implementation. One-hot encoding might require additional information in order to
achieve best synthesis Quality of Results (QoR). This is discussed in section 8.2.5
(page 313). Most synthesis compilers are configurable, and can be directed to use the
RTL encoding in the gate-level implementation, or to choose an alternate encoding.
This flexibility allows experimenting with different encoding schemes during synthesis
to find the implementation that is best for the target ASIC or FPGA, and best
meets design area, speed and power goals.
Most verification is done at the RTL level. Choosing the encoding scheme early in
the design process:
• Ensures that the design is verified with the same encoding scheme that is used in the
gate-level implementation.
• Allows the use of a Logic Equivalence Checker (a design tool that statically compares
the boolean functionality of two versions of a design) to compare the state
decoding logic of the RTL model and the gate-level implementation.
A three-process state machine model is simple to model and maintain, and will usually
produce good synthesis Quality of Results (QoR). While it is possible to model a
Finite State Machine with just one or two processes, doing so can make the RTL code
harder to write, harder to debug, and harder to reuse. There can be exceptions to this
guideline, where a two-process state machine model can be advantageous. Section
8.2.3.2 (page 307) discusses this situation.
Following is an example of a three-process Finite State Machine, modeled as
incomplete pseudocode.
// Current State Logic -- sequential logic
always_ff @(posedge clk or negedge rstN)
  if (!rstN)
    state <= RESET;
  else
    state <= next_state;
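A complete three-process state machine might look like the following sketch. It is only an illustration under stated assumptions: the module name fsm3_sketch is hypothetical, the next state expressions are borrowed from the decoder shown later in this section, and the output values for the WAITE and LOAD states are assumed, because the output decoders shown earlier elide them.

```systemverilog
module fsm3_sketch
(input  logic       clk, rstN, serial_in,
 input  logic [2:0] downcount,
 output logic       cntr_rstN, shift_en, data_rdy);

  typedef enum logic [1:0] {RESET, WAITE, LOAD, READY} states_t;
  states_t state, next;

  // Process 1: state sequencer -- sequential logic
  always_ff @(posedge clk or negedge rstN)
    if (!rstN) state <= RESET;
    else       state <= next;

  // Process 2: next state decoder -- combinational logic
  always_comb begin
    unique case (state)
      RESET: next = WAITE;
      WAITE: next = (serial_in == '0) ? WAITE : LOAD;
      LOAD : next = (downcount == '0) ? LOAD : READY;
      READY: next = WAITE;
    endcase
  end

  // Process 3: Moore output decoder -- combinational logic
  always_comb begin
    {cntr_rstN, shift_en, data_rdy} = 3'b000;   // default output values
    case (state)
      RESET: ;                                  // defaults apply
      WAITE: ;                                  // assumed: same as defaults
      LOAD : begin                              // assumed output values
               cntr_rstN = '1; shift_en = '1;
             end
      READY: data_rdy = '1;
    endcase
  end
endmodule
```

Each process reads only what it needs: the sequencer reads next, the next state decoder reads state and the FSM inputs, and the Moore output decoder reads state alone.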
• The separate processes are more reusable. The code for each process can be copied
into a different state machine design, and each block more easily modified for
use in the new design.
• The state sequencer code, in particular, will be almost exactly the same in every
state machine model.
• The next state and output combinational blocks can be sensitive to different inputs.
Modeling a Moore FSM architecture almost always requires that separate procedural
blocks be used.
For most designs, there is no disadvantage to using three separate processes to
model the three main functional blocks of a state machine. An exception might arise,
however, if both the next state decoding and the output decoding have complex
combinational logic that requires similar algorithms. In that situation, it might be more
advantageous to combine these two combinational logic procedures, as discussed
next, in section 8.2.3.2.
Another rare exception to the advantages of a three-process state machine can occur
if a state machine output is to be registered on the same clock cycle in which a state
is entered, rather than after the new state has been stored in the state flip-flops. In this
situation, some of the next state decoding might be moved into the state sequencer
procedure.
NOTE
FSM outputs coming directly from combinational logic can have glitches.
An important principle of hardware design needs to be remembered and
applied when designing Finite State Machines — non-registered signals can
have glitches between clock cycles. State machine outputs can be stored in
clocked registers to make the outputs more stable.
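Registering a decoded output can be sketched as below. This is a minimal, hypothetical example: the module name registered_output_sketch and the data_rdy_comb input (standing in for the output of a combinational output decoder) are assumptions, not part of the design in this chapter.

```systemverilog
module registered_output_sketch
(input  logic clk, rstN,
 input  logic data_rdy_comb,   // decoded FSM output, can glitch
 output logic data_rdy);       // registered output, changes only on a clock edge

  always_ff @(posedge clk or negedge rstN)
    if (!rstN) data_rdy <= '0;
    else       data_rdy <= data_rdy_comb;
endmodule
```

The trade-off is one clock cycle of added latency between the state change and the registered output.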
functionality. If, therefore, an input to the next state decoder changes, the output
decoder must also be evaluated. Combining the next state and output decoders into a
single process will most likely require using a Mealy architecture. A second disadvantage
is code maintenance and debugging. The single combinational logic procedure
will contain intermingled lines of code for calculating the next state and the output
values. Changes to one algorithm can inadvertently affect the other functional block.
A third disadvantage is that the combined functionality might be more difficult to
reuse in other projects because more project-specific signals and algorithms are
lumped into the single procedure.
WAITE: begin
if (serial_in == '0) begin
state <= ...
fsm_outputs <= ...
end
else begin
state <= ...
fsm_outputs <= ...
end
end
LOAD: begin
if (downcount != '0) begin
state <= ...
fsm_outputs <= ...
end
else begin
state <= ...
fsm_outputs <= ...
end
end
READY: begin
state <= ...
fsm_outputs <= ...
end
endcase
end
There are engineers, often those who come from a VHDL modeling background,
who are adamant that a single process coding style is a preferred coding style. In
SystemVerilog, however, a one process state machine has many disadvantages. Chief
among these disadvantages is that the RTL simulation does not accurately model the
gate-level implementation. At the gate level, combinational logic outputs update
whenever an input value changes. In the one-process RTL simulation, however, the
combinational logic mixed in the sequential logic block is only evaluated on a clock
edge. Another disadvantage is that, because the single process has all the FSM functionality
jumbled together, this modeling style can be difficult to debug if a functional
bug is found during RTL simulation. For the same reason, the one process coding
style is also difficult to reuse in other projects, without having to rewrite a lot of the
code within the one procedure.
Figure 8-17: Functional block diagram for a serial-to-parallel finite state machine
//////////////////////////////////////////////////////////
// 4-state State Machine with async active-low reset
//////////////////////////////////////////////////////////
//////////////////////////////////////////////////////////
//8-bit shift register with enable, async active-low reset
//////////////////////////////////////////////////////////
always_ff @(posedge clk or negedge rstN)
  if (!rstN)
    data <= '0;
  else if (shift_en)
    data <= {serial_in, data[7:1]};
//////////////////////////////////////////////////////////
// 3-bit Decrement Counter with sync active-low reset
//////////////////////////////////////////////////////////
always_ff @(posedge clk) // synchronous active-low reset
  if (!cntr_rstN)
    downcount <= '1; // reset to full count
  else
    downcount <= downcount - 1; // decrement counter
NOTE
Proper usage of blocking and nonblocking assignments is critical for obtaining
proper, race-free simulation behavior. Use blocking assignments for all
combinational logic assignments. Use nonblocking assignments for all
sequential logic assignments. Correct usage of blocking and nonblocking
assignments is discussed in Chapter 1, section 1.5.3 (page 23), Chapter 7,
section 7.2.4 (page 261) and in section 8.1.4 (page 278) of this chapter.
Figure 8-18: Synthesis result for Example 8-3: Simple-SPI using a state machine
always_comb begin
unique case (state)
RESET: next = WAITE;
WAITE: next = (serial_in == '0)? WAITE : LOAD;
LOAD: next = (downcount == '0)? LOAD : READY;
READY: next = WAITE;
endcase
end
NOTE
Synthesis compilers might infer a multi-bit comparator when a multi-bit case
expression is compared to a multi-bit case item. This is not an efficient gate-
level implementation of one-hot state machine encoding.
Since only one bit is set (“hot”) in one-hot encoding, only 1-bit comparators are
needed to determine which bit is set. The example above, however, is modeled as a 4-
bit comparator, and might synthesize as just that: 4-bit gate-level comparators. Synthesis
compilers might require the addition of a pragma or configuration setting to
instruct the compiler to optimize the implementation as one-bit comparators.
always_comb begin
  unique case (1'b1)
    state[0]: next = WAITE;
    state[1]: next = (serial_in == '0)? WAITE : LOAD;
    state[2]: next = (downcount == '0)? LOAD : READY;
    state[3]: next = WAITE;
  endcase
end
By reversing the case statement, all that is required at the gate-level is a 1-bit comparator
to compare the case expression and each case item, as opposed to when the
full state vector is compared to all bits of each case item. Synthesis compilers might
yield more optimized synthesis results using the reversed case style for one-hot state
machines, than the standard style of case statements.
One drawback of the reverse case statement shown above is that the code is not
self-documenting. A case item such as state[0]: does not make it obvious that this
is the RESET state. Reverse case statements can be made more readable by using constants
to define names for the index numbers of the one-hot bits.
One way of labeling the state bits is shown in the following example. This example
is simple and easy to read, but has limitations, which are discussed after the example.
localparam RESET = 0, // index of RESET one-hot bit
WAITE = 1, // index of WAITE one-hot bit
LOAD = 2, // index of LOAD one-hot bit
READY = 3; // index of READY one-hot bit
always_comb begin
  next = '0; // clear all bits in next
  unique case (1'b1) // set the bit representing next state
    state[RESET]: next[WAITE] = '1;
    state[WAITE]: if (serial_in == 0) next[WAITE] = '1;
                  else                next[LOAD ] = '1;
    state[LOAD ]: if (downcount == 0) next[LOAD ] = '1;
                  else                next[READY] = '1;
    state[READY]: next[WAITE] = '1;
  endcase
end
In the preceding example, state and next are loosely-typed 4-state variables,
instead of enumerated variables. This means the strongly-typed protections of enumerated
types have been lost (see Chapter 4, section 4.4.3, page 118). A better coding
style combines using enumerated types for the state variables and using constant
names for the one-hot bits. This style also allows the definition of the one-hot values
for each state to be derived from the definition for the state bits.
localparam RESET_BIT = 0, // index of RESET one-hot bit
           WAITE_BIT = 1, // index of WAITE one-hot bit
           LOAD_BIT  = 2, // index of LOAD one-hot bit
           READY_BIT = 3; // index of READY one-hot bit
always_comb begin
  unique case (1'b1)
    state[RESET_BIT]: next = WAITE;
    state[WAITE_BIT]: next = (serial_in == '0)? WAITE : LOAD;
    state[LOAD_BIT ]: next = (downcount == '0)? LOAD : READY;
    state[READY_BIT]: next = WAITE;
  endcase
end
The value for each state label is calculated by shifting a value of 1 (decimal) to the
bit position that is “hot” for that state. In this example, the enumerated variable is 4
bits wide. A value of 0001 shifted 0 times (the value of RESET_BIT) is 0001
(binary). A 0001 shifted 1 time (the value of WAITE_BIT) is 0010 (binary), and
shifted 2 times (the value of LOAD_BIT) is 0100 (binary), and shifted 3 times (the
value of READY_BIT) is 1000 (binary).
Basing the enumerated state labels off of the state bit constants means:
• There is no possibility of a coding error that defines different one-hot bit positions
in the local parameter and the enumerated type definitions.
• Should the design specification change the one-hot definitions, only the local
parameters specifying the bit positions have to change. The enumerated type defining
the state names will automatically reflect the change.
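The shift-based derivation of the state labels can be sketched as follows. This is one possible way to write it, not necessarily the exact form used in the book's example: the module name onehot_fsm_sketch and the state_out port are hypothetical, and the shift expressions use a sized 4'b0001 value as an assumption about how the "shift a 1" rule is spelled.

```systemverilog
module onehot_fsm_sketch
(input  logic       clk, rstN, serial_in,
 input  logic [2:0] downcount,
 output logic [3:0] state_out);     // hypothetical port, exposes the encoding

  localparam RESET_BIT = 0,         // index of RESET one-hot bit
             WAITE_BIT = 1,         // index of WAITE one-hot bit
             LOAD_BIT  = 2,         // index of LOAD one-hot bit
             READY_BIT = 3;         // index of READY one-hot bit

  // each label is a 1 shifted to the bit position that is "hot"
  typedef enum logic [3:0] {
    RESET = 4'b0001 << RESET_BIT,
    WAITE = 4'b0001 << WAITE_BIT,
    LOAD  = 4'b0001 << LOAD_BIT,
    READY = 4'b0001 << READY_BIT
  } states_t;

  states_t state, next;

  always_ff @(posedge clk or negedge rstN)
    if (!rstN) state <= RESET;
    else       state <= next;

  always_comb begin
    unique case (1'b1)
      state[RESET_BIT]: next = WAITE;
      state[WAITE_BIT]: next = (serial_in == '0) ? WAITE : LOAD;
      state[LOAD_BIT ]: next = (downcount == '0) ? LOAD : READY;
      state[READY_BIT]: next = WAITE;
    endcase
  end

  assign state_out = state;
endmodule
```

Changing a bit position in the localparam list changes both the case item index and the corresponding enumerated label value, which is the single point of definition the text describes.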
Using unique case to Optimize reverse case statements. All of the reverse case
statements above used the keyword pair unique case. This decision modifier is
necessary in these examples to help ensure optimal synthesis results and better design
verification in simulation. SystemVerilog language rules require that case items are
evaluated in the order in which they are listed. This rule means that each case item
takes priority over all subsequent case items. This priority encoded behavior requires
more logic gates and longer propagation paths than a simple parallel decoder.
Synthesis compilers will analyze case statements to see if all case item values are
unique, meaning no two case items have the same value. If synthesis can determine
that it is impossible for two case items to be true at the same time, synthesis compilers
will automatically remove the priority encoding, and evaluate the case items in parallel.
With a reverse case statement, however, the case items are not literal values that a
synthesis compiler can evaluate for having unique values. Instead, the case items are
bits of the state variable that are set and changed during simulation. Because the case
items are variable, synthesis compilers cannot determine that all case items have
unique values. Therefore, synthesis compilers will implement this reverse case statement
one-hot decoder with priority encoded logic, and will not automatically optimize
the case statement decoding for parallel evaluation.
The unique decision modifier tells synthesis compilers to treat the case items as
unique values, even when the compiler cannot determine this on its own. Synthesis
will optimize the gate-level implementation to have parallel decoding, instead of priority
encoded logic.
The unique decision modifier also has an important effect on simulation by
enabling two dynamic checks during simulation. A run-time warning is issued if the
case statement is entered and two or more case items are true at the same time. Thus,
simulation will catch any design bugs that result in the state variable having two bits
set at the same time. A run-time warning is also issued if the case statement is entered and
no case items are true. This can help detect design bugs, such as the state variable
being reset to 0, instead of a one-hot value.
SystemVerilog also has a unique0 decision modifier. Like unique, this modifier
informs synthesis compilers to assume all case items are mutually exclusive, and to
use parallel decoding instead of priority encoded logic. However, the unique0 modifier
only enables run-time checking that multiple case items are never true at the same
time. The modifier does not enable checking for no case items being true. The
unique modifier is more appropriate for reverse case statements because it enables
run-time checking for only one case item matching as well as for no case items
matching.
Chapter 7, section 7.4.2 (page 266) discusses the unique and unique0 decision
modifiers in more detail.
A register, which is most often made from flip-flops, stores a single value. A collection
of several registers can be used to store multiple values. Designs often need large
blocks of storage as well, such as a program memory or data memory. Using flip-flop
based registers is not a practical way to implement these large blocks of memory storage
at the gate level. Instead, memory components, such as RAM (Random Access
Memory) devices, are used for this type of storage.
ASICs and FPGAs have pre-defined and pre-optimized memory components for
larger blocks of storage. Since these are predefined in the ASIC or FPGA library, they
are not modeled at the RTL level and are not synthesized.
RTL models of a design that access memory devices need to instantiate behavioral
models of RAMs in order to fully verify the RTL functionality of the rest of the
design. These behavioral RAM models are not synthesized. Instead, after the RTL
design has been synthesized to a gate-level implementation, the instance of the behavioral
RAM module can be replaced with an instance of the optimized memory device
from the target ASIC or FPGA library.
endmodule: RAM
The storage in the RAM model is represented by the one-dimensional array of 8-bit
variables. The rest of the functionality in the model is the logic to read a value from a
specific address of the array, or to write a value into a specific address of the array.
This RAM example has three control inputs in addition to the primary data and
address inputs. All three control inputs are active-low. The ncs (not chip select) must
be active (0) in order to write to, or read from, the RAM. The nwr (not write) control
is active when writing to the RAM, and nrd (not read) is active when reading from
the RAM.
The data port is bidirectional, and is used as an input when writing into the RAM
and an output when reading from the RAM. The RAM drives data as an output when
both nrd and ncs are active (0). The RAM tri-states the data bus, allowing an external
driver to put values on the bus, when either the RAM is not selected (ncs is high),
or when it is not being read (nrd is high).
SystemVerilog syntax requires that bidirectional ports be declared as a net type
such as wire or tri. The example above declares data with the logic data type, which can
have 4-state values, but does not explicitly declare a net or variable type for data. When
a port is declared with a data type such as logic but no net or variable keyword,
SystemVerilog infers a wire net for input and inout ports, and a var variable for output
ports. This implicit inference is correct for this RAM model.
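The behavior described above can be sketched as a small asynchronous RAM model. This is a hedged illustration, not a reproduction of the book's example: the module name ram_async_sketch is hypothetical, the port list is assumed to match the one shown for the synchronous model later in this section (without a clock), and the explicit sensitivity list is an assumption.

```systemverilog
module ram_async_sketch
(inout  logic [7:0] data,    // bidirectional port, infers a wire net
 input  logic [7:0] addr,
 input  logic       nrd,     // active-low read control
                    nwr,     // active-low write control
                    ncs);    // active-low chip select

  logic [7:0] mem [0:255];   // one-dimensional array of 8-bit variables

  // drive the bus only when selected and reading; otherwise tri-state
  assign data = (!ncs && !nrd) ? mem[addr] : 'z;

  // general purpose always procedure: level-sensitive write
  always @(nwr or ncs or addr or data)
    if (!ncs && !nwr) mem[addr] <= data;
endmodule
```

Because the model uses a general purpose always procedure, a testbench can still write into mem directly through a hierarchical reference.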
Synchronous memory models. Synchronous RAMs store values, and read back values,
on a clock edge. Synchronous RAMs behave similarly to flip-flops, and are modeled
in a similar way.
module SRAM
(inout logic [7:0] data, // bidirectional port
input logic [7:0] addr,
input logic clk,
nrd, // active-low read control
nwr, // active-low write control
ncs // active-low chip select
);
logic [7:0] mem [0:255];
endmodule: SRAM
At the abstract behavioral level of modeling, the only difference between asynchronous
and synchronous memories is the sensitivity list of the always procedure.
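A synchronous version might be sketched as follows. The module name sram_sketch and the rd_data register are hypothetical, and registering the read value on the clock edge is an assumption about what "read back values on a clock edge" means for this model; a model that only clocks the write would differ in the read path.

```systemverilog
module sram_sketch
(inout  logic [7:0] data,    // bidirectional port
 input  logic [7:0] addr,
 input  logic       clk,
                    nrd,     // active-low read control
                    nwr,     // active-low write control
                    ncs);    // active-low chip select

  logic [7:0] mem [0:255];
  logic [7:0] rd_data;       // assumed: read value captured on the clock edge

  always @(posedge clk) begin            // clock edge replaces level sensitivity
    if (!ncs && !nwr) mem[addr] <= data;
    if (!ncs && !nrd) rd_data   <= mem[addr];
  end

  // drive the bus only when selected and reading; otherwise tri-state
  assign data = (!ncs && !nrd) ? rd_data : 'z;
endmodule
```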
Observe that the two RAM models in the preceding examples use the general purpose
always procedure instead of the RTL-specific always_ff or always_latch
procedures. The RTL-specific procedures enforce coding rules for synthesis, one of
which is that the variables assigned in the procedure cannot be assigned from any
other source. Abstract memory models are not intended to be synthesized, and do not
need to adhere to these synthesis rules. Indeed, enforcing synthesis rules would limit
the usefulness of these abstract behavioral models. It is common to load memory
models from outside of the always procedure, such as for a testbench to load a program
into a RAM model. The general purpose always procedure permits this external
loading of the memory array, whereas an always_ff or always_latch
procedure would prohibit it.
• "file" is the name of the pattern file, specified in quotation marks. The
file name string can be a simple file name, or can include a relative or full
directory path. By default, SystemVerilog searches for this file in the same
operating system directory from which simulation was invoked. Simulators
might provide ways to change this default search location.
• array_name is the name of the memory array in which the patterns are to
be loaded. Readmem tasks are typically called from within verification
code, and not from within the memory module containing the array.
Therefore, the array name is typically specified with a full module
instance hierarchy path.
• start_address specifies the address of the array into which the first pattern
should be loaded. Each subsequent pattern is loaded into each subsequent
array address, unless a new address is specified in the pattern file.
The start address argument is optional. If it is not specified, the first pattern
in the file is loaded into the lowest address number of the array.
• end_address specifies where to stop loading the array. This argument is
also optional. If not specified, the task continues loading the array until
either the last address of the array or the end of the pattern file is reached.
Following is an example of using a readmem task to load one of the RAM examples
shown earlier in this section.
initial begin
$readmemb("boot_program.txt", top.chip.rami.mem);
end
Although initial procedures are generally not synthesizable, synthesis compilers
recognize this specialized usage of an initial procedure to load a memory array.
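The optional start address argument and an address specified inside the pattern file can be sketched with a self-contained simulation example. The module name readmem_demo and the file name demo_patterns.txt are hypothetical; the demo writes its own pattern file first so that it does not depend on any external file. In the pattern file, an @ followed by a hexadecimal number sets a new load address.

```systemverilog
module readmem_demo;
  logic [7:0] mem [0:255];
  int fd;

  initial begin
    // create a small binary pattern file (hypothetical file name)
    fd = $fopen("demo_patterns.txt", "w");
    $fdisplay(fd, "// two patterns, then a jump to hex address 10");
    $fdisplay(fd, "0000_0001 0000_0010");
    $fdisplay(fd, "@10");
    $fdisplay(fd, "1111_0000");
    $fclose(fd);

    // load beginning at array address 4; no end address is specified
    $readmemb("demo_patterns.txt", mem, 4);
  end
endmodule
```

With this file, the first two patterns land at addresses 4 and 5, and the third pattern lands at address 16 (hexadecimal 10), because the pattern file overrides the running address at that point.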
8.4 Summary
This chapter has explored best practice coding styles for modeling synthesizable
flip-flops, registers and Finite State Machines. Abstract, non-synthesizable behavioral
models of memory devices such as RAMs were also discussed.
Two important coding practices should be followed when writing RTL models of
sequential logic:
• Use the RTL-specific always_ff procedure
• Use nonblocking assignments.
Adhering to these practices must be done by the engineer writing the RTL code.
The SystemVerilog language does not mandate these important coding styles.
Sequential devices can have many different ways of being reset. This chapter has
explored the proper coding styles and best-practice considerations for modeling and
synthesizing synchronous, asynchronous, active-high, and active-low resets. Avoiding
potential simulation glitches with set/reset flip-flops has also been examined.
Finite State Machines have three major parts: a state sequencer, next state decoding,
and output value decoding. While there are a variety of possible ways to code state
machines, the most advantageous style is to use 3 separate always procedures to represent
the three major parts of the state machine.
Chapter 9
Modeling Latches and Avoiding
Unintentional Latches
Abstract — This chapter presents best coding practice recommendations for modeling
latches in synthesizable RTL designs. The use of latches in ASIC and FPGA
designs is an oft-debated engineering topic. This book is neutral on this debate. The
purpose of this book is to show how to properly model latch behavior, should the
engineering decision be made to utilize latches in a project.
A related topic is avoiding unintentional latches. This chapter discusses the coding
styles that might infer latches where none are wanted, and several coding styles to
avoid inferring latches. The pros and cons of these coding styles are presented, and
best-practice coding styles recommended.
The topics presented in this chapter include:
• Proper RTL coding styles for representing latch behavior
• RTL code that infers latches when none are intended
• Modeling full (complete) decision statements
• The SystemVerilog unique, unique0 and priority decision modifiers
• The obsolete X value assignment, and its disadvantages
• The obsolete full_case and parallel_case synthesis pragmas
There are several types of latches in digital circuitry, with the most common types
being SR latches (also called set/reset latches) and transparent latches (also called D-
type latches). The reasons for using, or not using, latches in a design are a general
digital engineering topic, and beyond the scope of this book. Suffice it to say that the
Static Timing Analysis (STA) and Design for Test (DFT) tools used in the back-end
steps of designing ASICs or FPGAs work well with synchronous flip-flops, but using
these tools with latch-based designs is more difficult. Many ASIC and FPGA designers
avoid the use of latches to simplify the STA and DFT steps of a design flow.
Most ASIC and FPGA technologies support the use of transparent D-type latches.
This section discusses best-practice guidelines for modeling this type of latch, should
the choice be made to have latches in the design.
From an RTL modeling perspective, a latch is a cross of combinational logic and
sequential logic. Latches do not have a clock, and do not change on a positive or negative
edge transition. With latches, the output value is based on the values of the
inputs, which is the behavior of combinational logic. However, latches also have storage
characteristics. The output value is a reflection of both the input values and the
state of the internal storage, which is the behavior of sequential logic.
Transparent latch behavior can be modeled by using either the general purpose
always procedure or an RTL-specific always_latch procedure. The sensitivity list
for transparent latches is identical to the sensitivity list for combinational logic. It
must contain all signals that are read within the procedure.
The same synthesis restrictions and best practice guidelines for combinational logic
procedures also apply to latch procedures: The body of the procedure should not contain
any form of propagation delay (no #, @ or wait time delays), and no other procedural
block or continuous assignment can make assignments to the same variables
used on the left-hand side of the latch procedure.
Though not recommended for RTL modeling, properly using the general purpose
always procedure for modeling latch-based logic is discussed briefly, because it is
common to see this general purpose procedure in legacy Verilog models.
When using the general purpose always procedure, the sensitivity list must be
explicitly specified, or inferred using an @* sensitivity list, in the same way as with a
combinational logic always procedure. The following code snippet illustrates a simple
transparent latch modeled with the general purpose always procedure.
logic [7:0] in, out; // 8-bit variables
logic ena; // scalar (1-bit) variable
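A transparent latch with these signals might be sketched in both styles as follows. This is an illustrative sketch only: the module name latch_sketch and the two output names out_legacy and out_rtl are hypothetical, chosen so that the general purpose always style and the always_latch style can sit side by side in one module.

```systemverilog
module latch_sketch
(input  logic [7:0] in,
 input  logic       ena,
 output logic [7:0] out_legacy,   // hypothetical name: legacy always style
                    out_rtl);     // hypothetical name: always_latch style

  // legacy style: general purpose always with an inferred sensitivity list
  always @*
    if (ena) out_legacy <= in;    // transparent when ena is high, else holds

  // RTL-specific style: documents that a latch is intended
  always_latch
    if (ena) out_rtl <= in;
endmodule
```

Both procedures simulate the same way; the difference is that always_latch states the design intent to software tools and to other engineers.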
A disadvantage of using always to model latches. There are other ways in which a
general purpose always procedure can represent latches. A latch will be inferred
whenever a non-clocked always procedure triggers, and there is a possibility that a
variable that is an output of the procedure is not updated. The following simple example
has an if statement with an else branch, but each branch updates a different
variable.
always @* begin // inferred combinational logic sensitivity
  if (sel) y1 = in;
  else     y2 = in;
end
The variable that is not updated in each branch will store its previous value. Synthesis
will infer latches for both the y1 and y2 variables in this example.
This example shows a problem with using the general purpose always procedure
to model latches. There is no way for software tools, or other engineers, to know if a
latch was intended. Since the transparent latch sensitivity list is identical to a combinational
logic sensitivity list, it might be that the RTL designer’s intent was to model
combinational logic, and just inadvertently left out code to assign to all variables each
time the always procedure is entered.
always_latch begin
  if (clk1) // transparent when clk1 is high
    tmp <= a * b;
end // latched when clk1 is low
always_latch begin
  if (clk2) // transparent when clk2 is high
    out <= tmp * c;
end // latched when clk2 is low
endmodule: latch_pipeline
Figure 9-1 shows the results from synthesizing Example 9-1. The latched output of
each multiplier is evident.
Figure 9-1: Synthesis result for Example 9-1: Pipeline with intentional latches
The latch symbols in this schematic are generic latch symbols that could represent
any type of latch. When the synthesis compiler maps this schematic to a specific target
ASIC or FPGA, an appropriate device-specific latch will be selected from the
latch types available in that device.
NOTE
Synthesis will infer a latch whenever a non-clocked always procedure is
entered, and there is a possibility that one or more of the variables used on
the left-hand side of assignment statements will not be updated.
Inadvertent latches in state machine models. State machines where the number of
states is not a power of 2 do not use all the bits of the state variable, and therefore
have the potential of the case statement evaluating, and no branch being executed.
The output variables for the procedure are not updated, and retain their previous
value. A 5-state FSM will have at least 3 unused values.
always_comb begin
  case (current_state) // 5 states, binary count encoding
    3'b000: control_bus = 4'b0000;
    3'b001: control_bus = 4'b1010;
    3'b011: control_bus = 4'b1110;
    3'b100: control_bus = 4'b0110;
    3'b101: control_bus = 4'b0101;
  endcase
end
State machines that use one-hot encoding might also infer latches. When the always
procedure triggers, if the state variable has a non one-hot value (no bits set or multiple
bits set), then no branch will be executed. For example:
always_comb begin
  case (1'b1) // 5 states, one-hot encoding
    current_state[0]: control_bus = 4'b0110;
    current_state[1]: control_bus = 4'b1010;
    current_state[2]: control_bus = 4'b1110;
    current_state[3]: control_bus = 4'b0110;
    current_state[4]: control_bus = 4'b0101;
  endcase
end
In both of these previous examples, synthesis compilers will add latches to match
the simulation behavior of value retention. Synthesis compilers are doing the right
thing. Inserting gate-level latches ensures that the gate-level ASIC or FPGA behavior
is the same as the RTL behavior that was verified in simulation.
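The value retention that synthesis must preserve can be observed directly in simulation. The following is a minimal sketch, not taken from the book: the module name latch_demo and the signals state and ctl are hypothetical, and only two of the eight possible values are decoded so that the retention is easy to see.

```systemverilog
// Hypothetical demo: an incomplete case statement in a non-clocked
// procedure leaves ctl unchanged when no branch is taken.
module latch_demo;
  logic [2:0] state;
  logic [3:0] ctl;

  always_comb begin
    case (state)                 // only 2 of 8 values are decoded
      3'b000: ctl = 4'b0000;
      3'b001: ctl = 4'b1010;
    endcase                      // any other value: no assignment to ctl
  end

  initial begin
    state = 3'b001; #1;
    $display("state=%b ctl=%b", state, ctl);  // ctl holds the decoded value
    state = 3'b110; #1;
    $display("state=%b ctl=%b", state, ctl);  // ctl retains its previous value
    $finish;
  end
endmodule
```

The second message shows ctl unchanged even though state changed. This storage behavior is what a synthesis compiler reproduces by inferring a latch. Some tools may also issue a warning that the always_comb procedure does not represent purely combinational logic.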
There are times when the design engineer knows (or at least assumes) something
about the design that the synthesis tool cannot see by examining the decision
statement. In the 3-to-1 MUX example, the designer may know (or assume) that the design
will never generate a select value of 2'b11 and, therefore, the stored value of the y
variable will never be needed. In the one-hot state decoder example, the designer may
know (or assume) that the design will never produce a non one-hot value and,
therefore, the control_bus will never need to retain a previous value.
A synthesis compiler can only see that not all possible values of the case expression
were decoded. Synthesis sees a potential of the decision statement being evaluated,
and no branch taken. This would result in the variables assigned in the decision
statement not being updated.
When an incomplete decision statement is appropriate for the design functionality,
the design engineer needs to let synthesis compilers know that the unspecified decision
expression values can be ignored. There are several ways to tell synthesis that all
values used by the decision statement have been specified, and, therefore, latches are
not needed. Five common coding styles are:
1. Use a default case item within the case statement that assigns known output
values (discussed in section 9.3.3, page 335).
2. Use a pre-case assignment before the case statement that assigns known output
values (see section 9.3.4, page 338).
3. Use the unique and priority decision modifiers (section 9.3.5, page 340).
4. Use the obsolete (and dangerous) full_case synthesis pragma (section
9.3.6, page 345).
5. Use an X assignment value to indicate “don’t care” conditions (section 9.3.6,
page 345).
330 RTL Modeling with SystemVerilog for Simulation and Synthesis
Each coding style has advantages and disadvantages. These styles and their pros
and cons are discussed in the following subsections.
Best synthesis Quality of Results (QoR) will most often be achieved using either
coding style 1 (a default case item within the case statement that assigns known
values), or coding style 2 (a pre-case assignment before the case statement that assigns
known values).
Coding guidelines that use logic reduction coding styles for decision statements are
based on old-school techniques that were a best-practice coding style many years ago,
but are seldom necessary today.
Latch avoidance coding styles 1 and 2 both fully implement decision statements,
which is a better style for most designs targeting modern ASICs and FPGAs. These
two styles are discussed in sections 9.3.3 (page 335) and 9.3.4 (page 338),
respectively. The choice between these styles is mostly a personal preference.
The other three latch-avoidance coding styles use logic minimization techniques,
which have potential risks that might lead to ASIC or FPGA implementations that are
not as robust. The logic reduction coding styles should only be used when gate counts
need to be reduced in order to fit the design in a specific target device, or to meet flip-
flop setup times in a critical timing path.
A combinational logic block is used to decode the current state of the state machine
in order to determine the next state. The state values use a partial 3-bit Johnson Count
encoding, where a value of 1 is shifted into the state register for each subsequent state.
The code for the next state decoder is:
always_comb begin
  case (current_state)
    RESET: next_state = READY;
    READY: next_state = SET;
    SET  : next_state = GO;
    GO   : next_state = READY;
  endcase
end
This next state decoder functions correctly in simulation, but synthesis will infer
latches. Synthesis sees current_state as a 3-bit vector, which can have 8 possible
values, but the case statement only decodes 4 of those 8 values. Synthesis infers
latches for the next_state bits because of the possibility that the always procedure
could be entered while current_state has a value other than the 4 values that are
decoded. If this should occur, the next_state variable would not be updated.
NOTE
Some synthesis compilers report latch inference in an always_comb procedure
as a warning. This should be an error! The design engineer has indicated
an intent to have combinational logic. If a latch is inferred, there is a
mistake in the code. Synthesis compilation should abort with an error, rather
than issue a warning that might be overlooked or treated as non-critical.
Had this next state decoder been modeled using the traditional Verilog general
purpose always procedure (which can be used to model either combinational logic
or latch logic), synthesis compilers might assume the designer intended to model
latch behavior, and not generate a warning message to indicate that latched logic was
inferred. (Some synthesis compilers might have an option to enable latch inference
warnings from the general purpose always procedure.)
Simple FSM code and synthesis results. Example 9-2 shows the full context for
this simple state machine. Figure 9-3 shows the results of synthesizing this example.
Observe the latches in the schematic that were inferred for the next_state bits.
Example 9-2: Simple round-robin state machine that will infer latches
module simple_fsm
(input  logic clk, rstN,
 output logic get_ready, get_set, get_going
);
  // state sequencer
  always_ff @(posedge clk or negedge rstN)    // async reset
    if (!rstN) current_state <= RESET;        // active-low reset
    else       current_state <= next_state;
Figure 9-3: Synthesis result for Example 9-2: FSM with unintended latches
The simple FSM in Example 9-2 is functionally correct: the RTL works as
intended, and the synthesized implementation matches the RTL functionality. The
state encoding does not use all possible values of the current_state vector, and
therefore the next state decoder should not need to decode these unused values.
The following sections present several ways to let synthesis compilers know that
some current_state values are not used, and therefore no latches should be
inferred.
9.3.3 Latch avoidance style 1: Default case item with known values
For the next state decoder in Example 9-2, one of the next_state values can be
used as a default value, should a value of current_state occur that is not expected.
always_comb begin
case (current_state)
RESET : next_state = READY;
READY : next_state = SET;
SET : next_state = GO;
GO : next_state = READY;
default: next_state = RESET; // reset if error
endcase
end
Figure 9-4 shows the results of synthesizing the same simple state machine, but
with the default case item shown in the code snippet above. Observe that no latches
are inferred for next_state.
Figure 9-4: Synthesis result when using a default case item to prevent latches
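For readers who want to compile the default case item style in context, the following sketch places it in a complete module. The port list and state sequencer follow Example 9-2, and the enumerated values follow the Johnson Count encoding used in this section. The output decoder at the end is an assumption made only so that the module is self-contained; it is not taken from the original example.

```systemverilog
// Sketch only: the output decoder below is assumed for illustration.
module simple_fsm
(input  logic clk, rstN,
 output logic get_ready, get_set, get_going
);
  typedef enum logic [2:0] {RESET = 3'b000,   // Johnson Count encoding
                            READY = 3'b001,
                            SET   = 3'b011,
                            GO    = 3'b111} states_t;
  states_t current_state, next_state;

  // state sequencer
  always_ff @(posedge clk or negedge rstN)    // async reset
    if (!rstN) current_state <= RESET;        // active-low reset
    else       current_state <= next_state;

  // next state decoder with a default case item (latch avoidance style 1)
  always_comb begin
    case (current_state)
      RESET  : next_state = READY;
      READY  : next_state = SET;
      SET    : next_state = GO;
      GO     : next_state = READY;
      default: next_state = RESET;            // reset if error
    endcase
  end

  // output decoder (assumed): one output is asserted per active state
  assign get_ready = (current_state == READY);
  assign get_set   = (current_state == SET);
  assign get_going = (current_state == GO);
endmodule
```

Because every possible value of current_state leads to an assignment to next_state, there is no path through the procedure that requires storage.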
Pros and cons of using a default case item with case statements. Using a case
default branch ensures that all values of the case expression are decoded, even
values that should never occur. Should an unexpected case expression value occur, a
defined action will be taken. The gate-level implementation will be robust, and can
handle circumstances that were not anticipated in the RTL model. In the Johnson
Count state encoding example used in this section, current_state should never
have values of 010, 100, 101 or 110. But unexpected values can sometimes occur in
actual silicon. A chip might power up with a state value that is not used, or some EMF
interference could cause a glitch that results in a momentary state value that is not
used. The coding style of using a default branch that assigns a known value means
these unexpected conditions are decoded and handled in a defined way.
One disadvantage of using a default branch that assigns a known value is that this
style only addresses one cause of a latch being inferred: an incomplete case
statement. A latch could still be inferred unless every branch of the case statement,
including the default branch, assigns to the same variables.
A second disadvantage of using a default branch that assigns a known value is that
extra gate-level circuitry is required to decode the case expression values that are not
used by the design. A 16-state one-hot state machine requires a 16-bit vector, which
has 65,536 possible values (2^16), of which only 16 are needed by the design.
Decoding the remaining 65,520 values can require many more logic gates, which, if
everything is working as expected, will never be used. These additional gates could have a
negative effect on the area, speed, and power consumption of the ASIC or FPGA.
These extra gates are usually not a problem. Most modern ASICs and FPGAs have
ample capacity and speed to handle the extra gates. Furthermore, many synthesis
compilers have special FSM optimization algorithms that perform a reachability
analysis, and minimize the effects of the extra logic needed to decode all possible values
of a case expression.
NOTE
A default branch does not guarantee that latches will not be inferred. Even
with a fully-specified case statement, a latch might be inferred if a non-
clocked procedural block is entered, and there is a possibility that one or
more variables will not be updated. If the combinational logic procedure
does not have any pre-case assignments (see section 9.3.4), then every
branch of a case statement must make assignments to the same variables,
including the default branch.
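The point of this note can be illustrated with a short sketch. The module names, the select input sel and the outputs a and b are hypothetical. In the first module the case statement is fully specified, yet output b is not assigned in the default branch, so a latch is still expected for b. The second module assigns both outputs in every branch.

```systemverilog
// Hypothetical decoder: complete case statement, but b is not
// assigned in every branch, so b must retain its value (latch).
module default_not_enough
(input  logic [1:0] sel,
 output logic       a, b
);
  always_comb begin
    case (sel)
      2'b00  : begin a = 1'b0; b = 1'b0; end
      2'b01  : begin a = 1'b1; b = 1'b1; end
      default: begin a = 1'b0;           end  // b is not updated here
    endcase
  end
endmodule

// Corrected version: every branch assigns both a and b.
module default_complete
(input  logic [1:0] sel,
 output logic       a, b
);
  always_comb begin
    case (sel)
      2'b00  : begin a = 1'b0; b = 1'b0; end
      2'b01  : begin a = 1'b1; b = 1'b1; end
      default: begin a = 1'b0; b = 1'b0; end
    endcase
  end
endmodule
```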
An alternate style is to assign an X value for the default branch, instead of a known
value. This style has very different behavior for both simulation and synthesis, and is
discussed separately, in section 9.3.6 (page 345).
always_comb begin
next_state = RESET; // default to reset if invalid state
case (current_state)
RESET : next_state = READY;
READY : next_state = SET;
SET : next_state = GO;
GO : next_state = READY;
endcase
end
Figure 9-5 shows the results of synthesizing the same simple state machine shown
in Example 9-2 (page 334), but with a pre-case assignment before the incomplete
case statement. Observe that no latches are inferred for next_state.
The default assignment of known values within a case statement (style 1) and the
pre-case assignment (style 2) produce similar synthesis Quality of Results (QoR).
This can be seen by comparing Figure 9-5 with Figure 9-4 (page 336).
Pros and cons of pre-case assignments before decision statements. One
advantage of having a pre-case assignment is that designers do not need to be concerned
about ensuring that decision statements are complete in the rest of the procedure.
Another advantage is that the decision statements can focus on which combinational
outputs are significant for each branch of the decision. In the code snippet above, it is
obvious that the READY state affects the get_ready output, SET affects the get_set
output, and so forth.
A disadvantage of this coding style is it can be more difficult to see what is
assigned to all the variables used by the procedure by looking at a specific case item
branch. It is necessary to look at both the pre-case assignment and the case item
assignments to see all the values assigned. In a large, complex decoder, these
assignments could be separated by many lines of code.
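The trade-off described above can be seen in a small output decoder written with pre-case assignments. This is a sketch under stated assumptions: the module name fsm_outputs is hypothetical, the state values follow the Johnson Count encoding used in this section, and the behavior of one output per state is assumed for illustration.

```systemverilog
// Sketch: pre-case assignments give every output a known value, so the
// incomplete case statement that follows does not infer latches.
module fsm_outputs
(input  logic [2:0] current_state,
 output logic       get_ready, get_set, get_going
);
  // Encodings assumed to match the state machine in this section
  localparam logic [2:0] READY = 3'b001,
                         SET   = 3'b011,
                         GO    = 3'b111;

  always_comb begin
    // pre-case assignments: defaults for all outputs
    get_ready = 1'b0;
    get_set   = 1'b0;
    get_going = 1'b0;
    case (current_state)          // only the significant output per state
      READY: get_ready = 1'b1;
      SET  : get_set   = 1'b1;
      GO   : get_going = 1'b1;
    endcase
  end
endmodule
```

Each case item shows only the output that matters for that state. The cost is that the complete set of values for a given state can only be seen by reading the pre-case assignments together with the matching case item.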
An alternate coding style is to assign an X value for the pre-case assignment,
instead of a known value. This style has important trade-offs on design quality and
robustness, which are discussed in Section 9.3.6 (page 345).
Section 9.3.3 (page 335) used a default case item to avoid inferring unintentional
latches in the next state decoder. Section 9.3.4 (page 338) accomplished latch
avoidance by having a pre-case assignment prior to the decision statement. Both coding
styles behave as a fully specified decision in RTL simulations. Synthesis will
implement decoder logic for all case item values, including values that are not used by the
design. This is a safe coding style. Should a glitch or some other circumstance cause
an unexpected value on the case expression signal, the additional logic gates will
decode the value and perform as specified (and verified) in the RTL default case
item assignment or the pre-case assignment.
The disadvantage of these styles is that, if the unspecified values never occur, the
ASIC or FPGA contains logic gates and propagation paths that are never used. These
additional gates and propagation paths can make the IC larger, slower and less power
efficient. The functionality to decode values that never occur can be costly for designs
that push clock speeds or device gate count to their limits, or need to reduce power
consumption as much as possible.
SystemVerilog has three decision modifiers, unique, unique0 and priority,
that can enable certain gate-level reduction optimizations during the synthesis
process. These decision modifiers are specified immediately before the case,
case-inside, casez or casex keyword, as in:
always_comb begin
unique case (current_state)
RESET : next_state = READY;
READY : next_state = SET;
SET : next_state = GO;
GO : next_state = READY;
endcase
end
These decision modifiers can also be specified with an if-else-if decision series:
always_comb begin
  unique if (opcode == 2'b00) y = a + b;
  else if (opcode == 2'b01) y = a - b;
  else if (opcode == 2'b10) y = a * b;
end
These decision modifiers affect both simulation and how synthesis compilers
translate RTL code into a gate-level implementation.
Simulation. The unique decision modifier enables run-time checking for two
conditions:
1. A case item or if-else condition has been specified for all values that actually
occur during simulation. This means a decision branch is executed for every case
expression value that occurs during simulation.
2. There are never multiple decision branches true at the same time, meaning each
case item decodes a value that is unique from every other case item value.
Synthesis. The unique modifier instructs synthesis to perform two types of
optimizations:
1. Treat the decision statement as fully specified (sometimes referred to as full
case), and perform appropriate logic reduction optimizations.
2. Treat the decision statement conditions as mutually exclusive, meaning there will
never be multiple conditions true at the same time (sometimes referred to as
parallel case), and perform optimization to evaluate the decision conditions in
parallel, instead of with priority encoded logic.
Using unique is appropriate for functionality such as a one-hot state decoder,
where all one-hot values need to be decoded, and any value that is not one-hot should
never occur. Nor should multiple case items decode the same one-hot value.
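The two run-time checks performed by unique can be exercised with a small one-hot decoder and a few stimulus values. This is a hedged sketch: the module name onehot_unique_demo and the signals state and code are hypothetical, and the exact wording and severity of the violation messages depend on the simulator.

```systemverilog
// Hypothetical demo of the run-time checks enabled by unique case.
module onehot_unique_demo;
  logic [2:0] state;        // intended to be one-hot
  logic [1:0] code;

  always_comb begin
    unique case (1'b1)      // reverse case: find the bit that is set
      state[0]: code = 2'd0;
      state[1]: code = 2'd1;
      state[2]: code = 2'd2;
    endcase
  end

  initial begin
    state = 3'b010; #1;     // legal one-hot value: one branch matches
    state = 3'b000; #1;     // no bit set: no branch is taken
    state = 3'b110; #1;     // two bits set: more than one item matches
    $finish;
  end
endmodule
```

A simulator is expected to report a violation for the second and third stimulus values. These are exactly the conditions that synthesis has been told to assume will never occur.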
Synthesis. The unique0 modifier instructs synthesis to treat the decision statement
conditions as mutually exclusive (parallel case), and perform optimizations to
evaluate the decision conditions in parallel, instead of with priority encoded logic.
NOTE
At the time this book was written, some simulators and most synthesis
compilers did not support the unique0 decision modifier.
Use the unique decision modifier in RTL models when logic reduction is
desirable to prevent inferred latches. Do not use the unique0 modifier.
The unique0 decision modifier does not help prevent unintentional latches, since
it does not instruct synthesis to treat a decision as being fully specified.
An appropriate usage of the unique0 decision modifier is when a decision
statement will not infer latches, but synthesis cannot recognize that the case items can be
evaluated in parallel. The following code snippet illustrates this circumstance.
typedef enum logic [2:0] {A = 3'b100,
                          B = 3'b010,
                          C = 3'b001} modes_t;
modes_t selector_switch;
always_comb begin
  control = 4'h1; // assume switch is in A position
  unique0 case (selector_switch) inside
    3'b?1?: control = 4'h2; // switch is in B position
    3'b??1: control = 4'h3; // switch is in C position
  endcase
end
In this example, no latch will be inferred because of the pre-case assignment to the
control output. However, synthesis compilers will not recognize that the case item
values can be evaluated in parallel, because the wildcard bits could potentially allow
two or more case items to be true at the same time. Adding the unique0 modifier
tells synthesis that selector_switch will always have just one bit set, and therefore
can be evaluated in parallel.
The unique decision modifier would not be appropriate in this example. The
unique modifier would inform synthesis that the case statement was complete.
Synthesis might interpret this to mean that the only values control can have are those
assigned within the case statement, which are the values of 2 and 3. The pre-case
assignment would be ignored, and no gate-level logic would be implemented to
generate a control value of 1.
In simulation, a unique decision statement will generate a run-time simulation
violation report any time the case statement is evaluated and no branch is taken. A
violation message would occur whenever the selector_switch has a value of A. This is
an important violation message! It is saying that the case statement is not decoding all
values that occur, and therefore should not be synthesized as if it were a complete
case statement.
The previous code snippet could have been coded as a reverse case statement where
the use of the unique decision modifier would be appropriate, as shown in Chapter 8,
section 8.2.5 (page 313). The selector_switch decoding might also have been
coded without using wildcard bits in the case items, so that synthesis compilers could
recognize that the case item values are unique, and could be automatically optimized
as parallel decoding logic, without having to use a decision modifier.
Simulation. The priority decision modifier enables run-time checking for one
condition:
A case item or if-else condition has been specified for all values that actually
occur during simulation. This means a decision branch is executed for every case
expression value that occurs during simulation.
Synthesis. The priority modifier instructs synthesis to treat the decision statement
as fully specified (full case), and perform appropriate logic reduction optimizations.
Using priority is appropriate for functionality such as a priority interrupt
decoder, when there is a possibility that more than one branch of the decision could be
true at the same time, and the highest-priority interrupt should be serviced first.
Priority encoded behavior is discussed in more detail in Chapter 7, section 7.4 (page 265).
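A priority interrupt decoder of the kind mentioned above might be sketched as follows. The module name, the 4-bit irq input, the irq_num output, and the choice of irq[3] as the highest priority are all assumptions made for illustration.

```systemverilog
// Sketch of a priority interrupt decoder; names and widths are assumed.
module irq_decoder_sketch
(input  logic [3:0] irq,        // irq[3] is assumed to be highest priority
 output logic [1:0] irq_num
);
  always_comb begin
    priority case (1'b1)        // items are evaluated in the order listed
      irq[3]: irq_num = 2'd3;
      irq[2]: irq_num = 2'd2;
      irq[1]: irq_num = 2'd1;
      irq[0]: irq_num = 2'd0;
    endcase                     // irq == 4'b0000 matches no item
  end
endmodule
```

Several request bits may be set at once without a violation, since priority only checks that some branch is taken. If irq can be all zeros when the procedure is evaluated, simulation will report a violation. In that situation a default case item or a pre-case assignment is the safer choice.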
Figure 9-6: Synthesis result when using a unique case statement to prevent latches
The next state decoding logic in this small state machine example is too simple to
benefit from the logic reduction optimizations that are enabled by the unique
decision statement modifier. Even so, some gate-level logic reduction can be seen by
comparing Figure 9-6 above, with the fully specified decision statement examples
shown in Figures 9-4 (page 336) and 9-5 (page 339).
Pros and cons of decision modifiers. The primary advantage of the unique and
priority decision modifiers is the optimization of the gate-level implementation of
the RTL functionality. The logic reduction optimizations triggered by the unique and
priority modifiers can result in smaller, faster designs that consume less power. A
secondary advantage of the unique and priority modifiers is that they can prevent
unintentional inferred latches from an incomplete case statement, so long as the
variables that are assigned values in the combinational always procedure are updated
every time the procedure executes.
A disadvantage of the unique and priority modifiers is that the gate-level logic
reduction optimizations can lead to a gate-level implementation that is not robust, and
can have unpredictable or undesired behavior, should a hardware glitch occur that
causes an unexpected value on the case expression. This could result in an ASIC or
FPGA that does not work under all conditions. The run-time simulation checking that
is part of the unique and priority modifiers can help reduce the risks of logic
reduction optimization by verifying that the expression values not decoded never
occur. This verification, however, is only as effective as the test stimulus.
In general, fully specify all case statements using either a default case item
that assigns known values, or a pre-case assignment with known values. An
exception to this guideline is a one-hot state decoder using a reverse case
statement.
Only use the unique or priority decision modifiers if it is certain that the gate-
level logic reduction optimization is desirable. An exception to this guideline is a
reverse case statement, where using a unique or priority case statement is the
preferred way to avoid unintentional latches.
When decision statements are fully specified, a design will behave more predictably
under unexpected conditions, such as power-on glitches or glitches resulting from
interference. A fully specified case statement with known values assigned for all
possible conditions can be important when designing fault-tolerant functionality. Fault
tolerance affects many aspects of a design, not just decision statements. This is a
general engineering topic that is outside the scope of this book.
In the 1980s and 1990s, it was much more important to take every advantage of
gate-level minimization techniques in order to fit designs into the capacity and speed
of the ASICs and FPGAs available. This is not as critical with today’s ASIC and
FPGA technologies. Most designs will fit and run at the desired speed without
concern for gate-level logic reduction, with its associated risks.
NOTE
A unique case or priority case does not guarantee that latches will
not be inferred. A latch will be inferred any time there is a possibility that a
non-clocked procedural block can be entered and a variable is not updated.
If the procedure does not have any pre-case assignments (see section 9.3.4,
page 338), then every branch of a case statement must make assignments to
the same variables in order to avoid unintentional latches (see section 9.2
(page 327) for ways in which a latch might be inferred).
always_comb begin
next_state = states_t'('x); //case stmt should clear X's
case (current_state)
RESET :next_state = READY;
READY :next_state = SET;
SET :next_state = GO;
GO :next_state = READY;
endcase
end
Observe that, when assigning to an enumerated variable, only labels in the
enumerated definition can be directly assigned. The cast operator used in the examples above
overrides this restriction and forces the enumerated variable to an X value. An
alternate way to do this is to add another label to the enumerated type definition that has
an X value, and assign that label in the default branch. For example:
typedef enum logic [2:0] {RESET = 3'b000, // Johnson Count
                          READY = 3'b001,
                          SET   = 3'b011,
                          GO    = 3'b111,
                          XXX   = 3'bXXX} states_t;
always_comb begin
case (current_state)
RESET :next_state = READY;
READY :next_state = SET;
SET :next_state = GO;
GO :next_state = READY;
default: next_state = XXX; // don't care branch
endcase
end
Assigning a value of X as a default affects simulation and synthesis differently:
• Simulators will propagate an X onto the output variable(s) if an unexpected
decision value that is not decoded should occur. In the example above, a
current_state value of 3'b010, 3'b100, 3'b101 or 3'b110 will result in a
next_state value of 3'bxxx. This X value can be caught in verification, and
traced back to the unexpected value of current_state.
• Synthesis compilers treat a default assignment of an X value as a special flag to
indicate that decision values that were not explicitly decoded in the decision
statement are not of interest, and can be ignored. Synthesis will apply logic reduction
optimizations to minimize the gate-level implementation, so that the logic gates
only decode logic for the values explicitly listed in the decision statement.
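The simulation side of this behavior can be checked with a short testbench. The sketch below is illustrative only: the module name x_default_demo is hypothetical, and the unused encoding 3'b010 is forced with a cast purely to provoke the don't-care branch.

```systemverilog
// Hypothetical testbench: an undecoded state value propagates an X
// onto next_state, where verification code can detect it.
module x_default_demo;
  typedef enum logic [2:0] {RESET = 3'b000,   // Johnson Count
                            READY = 3'b001,
                            SET   = 3'b011,
                            GO    = 3'b111} states_t;
  states_t current_state, next_state;

  always_comb begin
    case (current_state)
      RESET  : next_state = READY;
      READY  : next_state = SET;
      SET    : next_state = GO;
      GO     : next_state = READY;
      default: next_state = states_t'('x);    // don't care branch
    endcase
  end

  initial begin
    current_state = READY; #1;
    if ($isunknown(next_state))
      $display("unexpected X for a decoded state");
    current_state = states_t'(3'b010); #1;    // force an unused encoding
    if ($isunknown(next_state))
      $display("X detected: current_state=%b is not decoded", current_state);
    $finish;
  end
endmodule
```

Only the second check is expected to fire. In a real verification environment the same test would more likely be written as an assertion on next_state than as a $display call.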
Figure 9-7 shows the results of synthesizing the state machine shown in Example 9-
2 (page 334), but with a default case item that assigns an X value.
Figure 9-7: Synthesis result using a default case X assignment to prevent latches
The gate-level optimization for this simple next state decoder is the same with
either the default X value assignment or the pre-case X value default assignment.
There can be other advantages to using one style versus the other, however. The pros
and cons of default case items versus pre-case assignments are discussed in sections
9.3.3 (page 335) and 9.3.4 (page 338).
To avoid latches from an incomplete case statement (but not an if-else statement),
synthesis compilers look for the pragma comment:
// synthesis full_case
NOTE
At the time this book was written, one commercial synthesis compiler did
not recognize // synthesis as a synthesis pragma. That compiler required
that pragmas start with // pragma or // synopsys.
The full_case pragma instructs synthesis compilers to assume that a case
statement is fully specified, as described in section 9.2 (page 327). This means that
synthesis can ignore any values for the case expression that do not match a case item.
Synthesis will perform the same gate-level logic reduction optimizations described
for the unique and priority decision modifiers (see 9.3.5, page 340).
An example of using the full_case pragma is:
always_comb begin
  case (select) // synthesis full_case
    2'b00: y = a;
    2'b01: y = b;
    2'b10: y = c;
  endcase
end
NOTE
The gate-level logic reduction that can occur from the full_case pragma is
a very different behavior than the simulation behavior of the RTL model.
This means the gate-level implementation was not verified in RTL
simulation, which can lead to a gate-level implementation that does not work as
intended.
Modern ASIC and FPGA designs have the speed and capacity to fully implement
all possible decision values, even those that are not expected to be used by the design.
Gate-level logic reduction for unused decision values is usually not needed. If, and
only if, these gate-level synthesis optimizations are needed in the design, the unique
or priority decision modifier should be used instead of the full_case pragma
comment. The SystemVerilog unique or priority decision modifiers have the
same synthesis optimization effects, but also report violation messages if a decision
expression value occurs that is not decoded in the decision statement. These
violations help to verify that the gate-level optimizations are safe to use.
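As an illustration of this recommendation, the 3-to-1 multiplexer from the full_case example can be rewritten with a decision modifier in place of the pragma. The module name mux3_unique and the 8-bit data width are assumptions made so that the sketch is self-contained.

```systemverilog
// Sketch: a decision modifier used in place of the full_case pragma.
module mux3_unique
(input  logic [1:0] select,
 input  logic [7:0] a, b, c,     // 8-bit width is an assumption
 output logic [7:0] y
);
  always_comb begin
    unique case (select)         // used instead of: // synthesis full_case
      2'b00: y = a;
      2'b01: y = b;
      2'b10: y = c;
    endcase                      // select == 2'b11 is not decoded
  end
endmodule
```

If select ever has the value 2'b11 during simulation, a violation is reported, which the pragma version would have passed over silently. Note that unique also asserts that the case items are mutually exclusive. The priority modifier is the closer match when only the full_case behavior is wanted.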
A full_case pragma does not guarantee that latches will not be inferred. There
are other things that can infer latches, in addition to a case statement that is not “full”
(complete). Latches will be inferred if there is a possibility that any of the variables
assigned in a combinational logic procedure will not be updated every time the
always procedure is entered.
Synthesis compilers also look for a parallel_case pragma, which enables
specific gate-level optimizations of case statements. This pragma is discussed in Chapter
7, section 7.4.3 (page 270). Just like the full_case pragma, using the
parallel_case pragma can result in synthesis compilers creating a gate-level
implementation that behaves very differently than the RTL model that was verified in
simulation.
The SystemVerilog priority, unique and unique0 decision modifiers replace
the obsolete full_case and parallel_case synthesis pragmas. These modifiers
inform synthesis to do the same optimizations as the pragmas, and, importantly,
also enable run-time simulation checking to help verify that design conditions do not
occur that would not work with the gate-level optimizations. These run-time checks
are described in section 9.3.5 (page 340).
• priority enables the same synthesis gate-level optimizations as the pragma
// synthesis full_case
• unique enables the same synthesis gate-level optimizations as the pragma
// synthesis full_case parallel_case
• unique0 enables the same synthesis gate-level optimizations as the pragma
// synthesis parallel_case
NOTE
At the time this book was written, one commercial synthesis compiler did
not recognize // synthesis as a synthesis pragma. That compiler required
that pragmas start with // pragma or // synopsys.
The pragma pair translate_off tells synthesis compilers to ignore all code that
follows, until a translate_on pragma is encountered. This allows debug code to be
embedded into RTL code. Simulation, which ignores comments, will compile and
execute the debug statements, but synthesis compilers will not try to implement the
code in the target ASIC or FPGA. An alternative to using the translate_off and
translate_on synthesis pragmas is to use conditional compilation. Most synthesis
compilers have a predefined SYNTHESIS macro that can be used to conditionally
include or exclude code that the synthesis tool compiles.
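Both approaches can be compared side by side in a short sketch. The module name debug_demo and its ports are hypothetical, and the SYNTHESIS macro is assumed to be predefined by the synthesis compiler, as described above. The pragma prefix shown is // synthesis, which some tools may not recognize.

```systemverilog
// Sketch: two ways to hide debug code from synthesis.
module debug_demo
(input  logic       clk,
 input  logic [7:0] d,
 output logic [7:0] q
);
  always_ff @(posedge clk) q <= d;

  // Style 1: pragma pair. Simulators treat these lines as ordinary
  // comments and compile the debug code between them.
  // synthesis translate_off
  always @(posedge clk) $display("%t: d=%h", $time, d);
  // synthesis translate_on

  // Style 2: conditional compilation. The debug code is compiled only
  // when the SYNTHESIS macro is not defined.
  `ifndef SYNTHESIS
  always @(posedge clk)
    if ($isunknown(d)) $display("%t: d has unknown bits", $time);
  `endif
endmodule
```

Conditional compilation has the advantage of being part of the language itself, so that a misspelled directive is reported by the compiler. A misspelled pragma is just a comment and is silently ignored.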
9.4 Summary
This chapter has presented the best-practice coding style for modeling intentional
latches as RTL models. Latch behavior in RTL simulations will occur anytime a non-
clocked (no posedge or negedge sensitivity) always procedure triggers, and one or
more variables are not assigned a value. In simulation, the variable will keep its
previous value. This state retention requires some form of latch at the gate-level circuitry.
Synthesis will automatically recognize when there is a potential for state retention in
the RTL code, and infer latches in the ASIC or FPGA implementation.
The old-fashioned coding style of using X value default assignments was discussed,
along with several disadvantages of this older coding style. The obsolete full_case
synthesis pragma was also discussed, along with the reasons it should never be used.
The priority, unique and unique0 decision modifiers enable synthesis
optimizations, and replace the old synthesis full_case and parallel_case pragmas.
These decision modifiers add RTL verification checks to help ensure that the gate-
level optimizations will work as intended.
Chapter 10
Modeling Communication Buses —
Interface Ports
Abstract — Designs often use standard bus protocols such as PCI Express, USB, or
AMBA AXI. Bus protocols bundle together several signals, including data signals,
address signals, and various control signals. Bus protocols require functionality on
each end of the bus to set and clear control lines in a specified order, and to transfer
data and address values.
SystemVerilog interfaces are a type of module port, but are more versatile than a
simple input, output or inout port. In its simplest form, an interface port bundles
related signals together as a single, compound port. For example, all the individual
signals that make up an AMBA AXI bus can be grouped together as an interface port.
An interface can do more than just encapsulate bus signals. SystemVerilog interfaces
provide a means for designers to centralize the functionality of a bus, as opposed to
having the functionality scattered throughout several modules in a design. This simplifies
the design engineer’s work at the RTL level, and lets synthesis do the work of
distributing the gate-level bus hardware appropriately throughout the design.
Interfaces are synthesizable when specific modeling guidelines and restrictions are
followed. Interfaces can also be used at a non-synthesizable transaction level of modeling,
and as part of a verification testbench. Advanced verification methodologies
such as UVM, OVM and VMM utilize interfaces.
This chapter examines interfaces as a synthesizable RTL modeling construct. The
concepts covered in this chapter are:
• Interface declarations
• Connecting interfaces to module ports
• Differences between interfaces and modules
• Interface ports and port directions (modports)
• Tasks and functions in interfaces (interface methods)
• Procedural blocks in interfaces
• Parameterized interfaces
This simple AHB bus communicates between a single master and single slave module,
and therefore does not require the bus arbiter and decoder blocks that a full
AMBA AHB bus would need.
Figure 10-1: Block diagram connecting a Master and Slave using separate ports
Example 10-1 shows the code for connecting the master and slave modules illustrated
in Figure 10-1. Observe the repetition of declarations for the 8 signals that comprise
the simple AHB bus. The same signals must be declared in the master module,
the slave module, the chip-level module that connects master and slave, and in the
connections to the module instances for master and slave. Example 10-1 is modeled
using traditional Verilog-2001 style and data types.
Example 10-1: Master and slave module connections using separate ports
///////////////////////////////////////////////////////////
// Master Module Port List -- Verilog-2001 style
///////////////////////////////////////////////////////////
module master (
  // simplified AHB bus signals
  // (the master module needs individual ports for the simplified AHB signals)
  input  wire        hclk,
  input  wire        hresetN,
  output reg  [31:0] haddr,
  output reg  [31:0] hwdata,
  output reg         hwrite,
  output reg  [ 2:0] hsize,
  input  wire [31:0] hrdata,
  input  wire        hready,
  // other signals
  input  wire        m_clk,   // master clock
  input  wire        rstN,    // reset, active low
  input  wire [ 7:0] thing1,  // misc signal; not part of bus
  output reg  [ 7:0] thing2   // misc signal; not part of bus
);
  ... // master module functionality not shown
endmodule: master
///////////////////////////////////////////////////////////
// Slave Module Port List -- Verilog-2001 style
///////////////////////////////////////////////////////////
module slave (
  // simplified AHB bus signals
  // (the slave module needs duplicate ports for the simplified AHB signals;
  // vector sizes must match for correct functionality)
  input  wire        hclk,
  input  wire        hresetN,
  input  wire [31:0] haddr,
  input  wire [31:0] hwdata,
  input  wire        hwrite,
  input  wire [ 2:0] hsize,
  output reg  [31:0] hrdata,
  output reg         hready,
  // other signals
  input  wire        s_clk,   // slave clock
  input  wire        rstN,    // reset, active low
  output reg  [ 7:0] thing1,  // misc. signal; not part of bus
  input  wire [ 7:0] thing2   // misc. signal; not part of bus
);
  ... // slave module functionality not shown
endmodule: slave
///////////////////////////////////////////////////////////
// Top-level Netlist Module -- Verilog-2001 style
///////////////////////////////////////////////////////////
module chip_top;
  // Simplified AHB bus signals
  // (the higher-level module must duplicate the simplified AHB signals
  // again in order to connect master and slave)
  wire        hclk;
  wire        hresetN;
  wire [31:0] haddr;
  wire [31:0] hwdata;
  wire        hwrite;
  wire [ 2:0] hsize;
  wire [31:0] hrdata;
  wire        hready;
  // Other signals
  wire        m_clk;      // master clock
  wire        s_clk;      // slave clock
  wire        chip_rstN;  // reset, active low
  wire [ 7:0] thing1;     // misc signal; not part of bus
  wire [ 7:0] thing2;     // misc signal; not part of bus

  // The connections to the master ports must duplicate the simplified AHB
  // signals again (SystemVerilog's dot-name or dot-star shortcuts could
  // reduce this redundancy if all names match exactly).
  master m ( // simplified AHB bus connections
    .hclk(hclk),
    .hresetN(hresetN),
    .haddr(haddr),
    .hwdata(hwdata),
    .hsize(hsize),
    .hwrite(hwrite),
    .hrdata(hrdata),
    .hready(hready),
    // Other connections
    .m_clk(m_clk),
    .rstN(chip_rstN),
    .thing1(thing1),
    .thing2(thing2)
  );

  // The connections to the slave ports duplicate the simplified AHB signals
  // yet again (the dot-name or dot-star shortcuts could reduce this
  // redundancy if all names match exactly).
  slave s ( // simplified AHB bus connections
    .hclk(hclk),
    .hresetN(hresetN),
    .haddr(haddr),
    .hwdata(hwdata),
    .hsize(hsize),
    .hwrite(hwrite),
    .hrdata(hrdata),
    .hready(hready),
    // Other connections
    .s_clk(s_clk),
    .rstN(chip_rstN),
    .thing1(thing1),
    .thing2(thing2)
  );
endmodule: chip_top
Disadvantages of separate module ports. Using separate module ports for the bus
signals provides a simple and intuitive way of describing the interconnections
between the blocks of a design. The individual ports accurately model the signals that
make up the physical implementation of the bus. In large, complex designs, however,
using individual module ports has several shortcomings. Some of these disadvantages
are:
• Declarations must be duplicated in multiple modules.
• Communication protocols, such as a handshake sequence, must be duplicated in several
modules.
• There is a risk of mismatched declarations in different modules.
• A change in the design specification can require modifications in multiple modules.
The signals that make the simplified AHB bus in the preceding example must be
declared in each module that uses the bus, as well as in the top-level netlist that connects
the master and slave modules together. Even with the simplified AHB bus
example listed above — which only uses 8 of the 19 AHB bus signals, and only has a
single slave module — the duplication of names is obvious. Each AHB signal is
named a total of 7 times!
This duplication not only requires typing a lot of lines of code, but has a high
potential for coding mistakes. A mis-typed name or incorrect vector size in one place
can result in a functional bug in the design that is not detected until late in the design
process when modules are connected together for full chip verification.
The replicated port declarations also mean that, should the specification of the bus
change during the design process (or in a next generation of the design), each and
every module that shares the bus must be changed. The netlists used to connect the
modules using the bus must also be changed. This widespread effect of a change is
counter to good coding style. One goal of good coding is to structure the code in such
a way that a small change in one place should not require changing other areas of the
code. A weakness of using discrete input and output ports is that a change to the ports
in one module will usually require changes in other files.
Another disadvantage of using discrete input and output module ports is that communication
protocols must be duplicated in each module that utilizes the interconnecting
signals between the modules. If, for example, three modules read and write
from a shared memory device, then the read and write control logic must be duplicated
in each of these modules.
Figure 10-2: Block diagram connecting a Master and Slave using interface ports
[Diagram: the Master and Slave modules each connect through a single interface port to an
interface containing hclk, hresetN, haddr, hwdata, hrdata, hsize, hwrite and hready.
m_clk, s_clk, rstN, thing1 and thing2 remain separate signals outside the interface.]
The following three-part example shows how using interfaces can reduce the
amount of code required to model the simple AHB communication bus shown in section
10.1.1 (page 357) above.
Example 10-2 shows a definition of an interface component that encapsulates the
signals that make up the simple AHB as an interface.
Example 10-3 (page 363) shows the definitions of the master and slave modules.
The 8 separate ports on the master module for the simple AHB bus have been
replaced by a single interface port. Instead of declaring this interface port as input,
output or inout, the interface port is declared as simple_ahb, which is the name
of the interface defined in Example 10-2. The interface ports eliminate the redundant
simple AHB signal declarations that the master and slave modules need when traditional
individual input and output ports are used, as was the case in Example 10-1.
Example 10-4 (page 364) shows the higher-level netlist that connects the master
and slave modules. Gone are the 24 lines of code to declare 8 separate bus signals and
then connect those 8 signals to the master and the slave module ports that were listed
in Example 10-1. Instead, the simple_ahb interface is instantiated in the same way
as a module, and the instance name is connected to the interface ports of the master
and slave module instances.
Example 10-2: An interface definition for the 8-signal simple AMBA AHB bus
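The body of this listing did not survive in this copy of the text. The following is a minimal sketch of what such an interface definition could look like, assuming the 8 signals, vector sizes and directions used in Example 10-1 and the modport names master_ports and slave_ports mentioned later in this section. It is a reconstruction for illustration, not the original listing.

```systemverilog
// Sketch only: signal names and sizes are taken from Example 10-1;
// the modport contents are inferred from the port directions there.
interface simple_ahb (
  input logic hclk,      // bus transfer clock
  input logic hresetN    // bus reset, active low
);
  logic [31:0] haddr;    // transfer start address
  logic [31:0] hwdata;   // data sent to slave
  logic [31:0] hrdata;   // return data from slave
  logic [ 2:0] hsize;    // transfer size
  logic        hwrite;   // 1 for write, 0 for read
  logic        hready;   // 1 for transfer finished

  // Directions as seen by the bus master
  modport master_ports (
    output haddr, hwdata, hsize, hwrite,
    input  hrdata, hready, hclk, hresetN
  );

  // Directions as seen by the bus slave (mirror image of the master view)
  modport slave_ports (
    output hrdata, hready,
    input  haddr, hwdata, hsize, hwrite, hclk, hresetN
  );
endinterface: simple_ahb
```

Note that the data types and vector sizes appear exactly once; the two modports only add a direction to each signal.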
Ports on interfaces. An interface can have input, output, and inout ports just like a
module. The simple AHB interface shown in Example 10-2 has two input ports, hclk
and hresetN. These signals are generated outside of the interface and passed into the
interface through the two input ports. The declaration of ports on an interface is the
same as for ports on modules.
An interface can also have interface ports, the same way a module can have interface
ports. This allows one interface to be connected to another interface. For example,
the main bus of a design might have one or more sub-busses. Both the main bus
and its sub-busses can be modeled as interfaces, and sub-bus interfaces can be used as
ports of the main bus interface.
Interface modports, a first look. The interface definition above includes two modport
definitions, with the names master_ports and slave_ports. The keyword
modport is an abbreviation for “module’s ports”, and defines whether a module sees
the signals in the interface as inputs to the module or outputs from the module. One
advantage of interfaces is that the data types and vector sizes of the signals used in the
bus protocol are defined one time. The modport definitions simply add a direction,
from a module’s perspective, to the signals defined in the interface. Modport definitions
are covered in more detail in section 10.3 (page 367).
The following example of a master and slave module illustrates using the
simple_ahb interface as a port to each module. Observe how the single interface
port replaces the 8 discrete input and output ports shown in the master module, and 8
more ports in the slave module, as shown in Example 10-1 (page 358).
///////////////////////////////////////////////////////////
// Slave Module Port List
///////////////////////////////////////////////////////////
module slave
(simple_ahb.slave_ports ahb,   // interface port with modport
 // other ports
 input  logic       s_clk,     // slave clock
 input  logic       rstN,      // reset, active low
 output logic [7:0] thing1,    // misc signal; not part of bus
 input  logic [7:0] thing2     // misc signal; not part of bus
);
  ... // slave module functionality not shown
endmodule: slave
Connecting interface ports, a first look. With traditional module ports, a top-level
module must declare individual nets for the bus signals, and then make separate connections
for each individual signal to the ports of each module instance. Interfaces
greatly simplify these connections. When a module with an interface port is instantiated,
an instance of an interface is connected to the interface port.
The following code instantiates the simple_ahb interface and gives it an instance
name of ahb1. This instance name is then used in the port connections of the master
and slave module instances.
Example 10-4: Netlist connecting the master and slave interface ports
///////////////////////////////////////////////////////////
// Top-level Netlist Module -- SystemVerilog-2012 style
///////////////////////////////////////////////////////////
module chip_top;
  logic       m_clk;      // master clock
  logic       s_clk;      // slave clock
  logic       hclk;       // AHB bus clock
  logic       hresetN;    // AHB bus reset, active low
  logic       chip_rstN;  // reset, active low
  logic [7:0] thing1;     // misc signal; not part of bus
  logic [7:0] thing2;     // misc signal; not part of bus
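The listing breaks off after the local declarations. A minimal sketch of how the complete netlist might read is given below. It assumes that simple_ahb, master and slave are defined as described in this section (an interface with hclk and hresetN input ports, and modules whose first port is an interface port named ahb); the exact connection style of the original listing is not known.

```systemverilog
// Sketch only: relies on the simple_ahb interface and the master and slave
// modules described in this section being compiled alongside this module.
module chip_top;
  logic       m_clk;      // master clock
  logic       s_clk;      // slave clock
  logic       hclk;       // AHB bus clock
  logic       hresetN;    // AHB bus reset, active low
  logic       chip_rstN;  // reset, active low
  logic [7:0] thing1;     // misc signal; not part of bus
  logic [7:0] thing2;     // misc signal; not part of bus

  // One instance of the interface carries all 8 bus signals
  simple_ahb ahb1 (.hclk(hclk), .hresetN(hresetN));

  // The interface instance name is connected to each interface port
  master m (.ahb(ahb1),
            .m_clk(m_clk), .rstN(chip_rstN),
            .thing1(thing1), .thing2(thing2));

  slave  s (.ahb(ahb1),
            .s_clk(s_clk), .rstN(chip_rstN),
            .thing1(thing1), .thing2(thing2));
endmodule: chip_top
```

Compared with Example 10-1, the eight net declarations and sixteen per-signal port connections collapse into one interface instance and two connections.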
In the examples above, all the signals that make up the simple AHB bus protocol
have been encapsulated into the simple_ahb interface. The master, slave, and top-level
modules do not duplicate the declarations of these bus signals. Instead, the master
and slave modules simply use the interface as the connection between the modules.
The interface eliminates the redundant declarations of separate module ports.
NOTE
Section 10.2 (page 366) provides additional details on declaring module interface
ports and connecting to interface ports.
An interface port is a compound port that has signals inside the port. Within a module
that has an interface port, the signals inside the interface are accessed by using the
port name, using the following syntax:
<port_name>.<internal_interface_signal_name>
The simple_ahb interface above contains a signal called hclk, and master has
an interface port named ahb. The master module can access hclk by using
ahb.hclk.
always_ff @(posedge ahb.hclk)
Use short names for interface port names in RTL models. The port name will
need to be referenced frequently in the RTL code.
Since signals within an interface are accessed by prepending the interface port
name to the signal name, it is convenient to use short names for interface port names.
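As a slightly fuller illustration of the <port_name>.<signal_name> syntax, the sketch below registers an address onto the bus. The reduced three-signal interface and the next_addr input are hypothetical stand-ins so that the fragment is complete on its own, and the logic is not a real AHB transfer sequence.

```systemverilog
// Reduced stand-in for the chapter's simple_ahb interface (assumption).
interface simple_ahb (input logic hclk, input logic hresetN);
  logic [31:0] haddr;
  logic        hwrite;
  modport master_ports (input hclk, hresetN, output haddr, hwrite);
endinterface: simple_ahb

module master (
  simple_ahb.master_ports ahb,        // short port name: used on every reference
  input  logic [31:0]     next_addr   // hypothetical input, for illustration only
);
  // Every bus signal is reached through the port name: ahb.<signal>
  always_ff @(posedge ahb.hclk or negedge ahb.hresetN)
    if (!ahb.hresetN) begin
      ahb.haddr  <= '0;
      ahb.hwrite <= 1'b0;
    end
    else begin
      ahb.haddr  <= next_addr;
      ahb.hwrite <= 1'b1;
    end
endmodule: master
```

With a long port name such as simple_ahb_master_bus, each of these references would grow accordingly, which is the practical reason for the guideline above.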
There are three fundamental differences between an interface and a module. First,
an interface cannot contain design hierarchy. Unlike a module, an interface cannot
contain instances of modules or primitives that would create a new level of implementation
hierarchy. Second, an interface can be used as a module port, which is what
allows interfaces to represent communication channels between modules. It is illegal
to use a module in a port list. Third, an interface can contain modports, which allow
each module connected to the interface to see the interface differently. Modports are
described in detail in section 10.3 (page 367).
A generic interface port defines the port type by using the keyword interface,
instead of using the name of a specific interface type. The syntax is:
module <module_name> (interface <port_name>);
When the module is instantiated, any type of interface can be connected to the
generic interface port. This provides flexibility, in that the same module can be used
in multiple ways, with different interfaces connected to the module. In the following
example, module bridge is defined with two generic interface ports:
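interface ahb_bus;
  ... // signal declarations for an AMBA AHB bus
endinterface
interface usb_bus;
  ... // signal declarations for a USB bus
endinterface
module bridge (interface bus_a,   // generic interface port (port name is illustrative)
               interface bus_b);  // generic interface port (port name is illustrative)
  ... // bridge functionality not shown
endmodule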
Each generic interface port could have either an ahb_bus interface instance or a
usb_bus interface instance connected to it (or any other type of interface).
endmodule
A type-specific interface port can only be connected to an instance of an interface
of the same type. In the example above, a higher-level netlist could instantiate the
cache module and connect an instance of an ahb_bus interface, but could not connect
an instance of a usb_bus interface. A simulator or synthesis tool will issue an
elaboration error if the wrong type of interface instance is connected to the type-specific
interface port. Type-specific interface ports ensure that a wrong interface can
never be inadvertently connected to the port. Explicitly naming the interface type that
can be connected to the port also makes the port type more obvious to anyone else
who needs to review or maintain the module. With type-specific interface ports, it is
easier to see exactly how the port is intended to be used.
Use type-specific interface ports for RTL models. Do not use generic interface
ports in design modules.
The functionality of a module will almost always need to reference signals within
the interface. With a type-specific interface port, the signal names within the interface
are known at the time the module is written, and can be referenced without concern.
With generic interface ports, there is no guarantee that every interface instance connected
to the module’s interface port will have the same signal names within the interface.
SystemVerilog provides two methods for specifying which modport view a module
interface port should use:
• As part of the interface port declaration in the module definition
• As part of the interface connection to a module instance
Both of these styles are synthesizable, but there are advantages to specifying the
modport as part of the module port definition, which are discussed in the following
paragraphs.
Selecting the modport in the module’s interface port declaration. The specific
modport to be used from an interface can be specified directly as part of the interface
port declaration within the module. The modport to be connected to the interface is
specified as:
<interface_name>.<modport_name>
For example:
module master
(simple_ahb.master_ports ahb,  // interface port & modport
 // other ports
 input  logic       m_clk,     // master clock
 input  logic       rstN,      // reset, active low
 input  logic [7:0] thing1,    // misc signal; not part of bus
 output logic [7:0] thing2     // misc signal; not part of bus
);
  ... // master module functionality not shown
endmodule: master
Only type-specific interface ports can specify a modport as part of the port declaration.
A generic interface port cannot specify a modport.
At the higher-level module that instantiates and connects this master module, an
instance of the interface is connected to the module port, without specifying the name
of a modport. For example:
module chip_top;
... // local net declarations
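The listing is cut short at this point. A sketch of how it might continue is shown below, assuming the simple_ahb interface and the master module defined earlier in this chapter, and the instance name ahb1 used elsewhere in the text. Because master already selects master_ports in its own port declaration, only the interface instance name appears in the connection.

```systemverilog
// Sketch only: relies on the simple_ahb interface and master module
// from this chapter being compiled alongside this module.
module chip_top;
  logic       m_clk, hclk, hresetN, chip_rstN;  // local net declarations
  logic [7:0] thing1, thing2;

  simple_ahb ahb1 (.hclk, .hresetN);   // interface instance, dot-name connections

  master m (.ahb(ahb1),                // interface instance only; no modport name here
            .rstN(chip_rstN),
            .*);                       // wildcard connects m_clk, thing1, thing2
endmodule: chip_top
```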
Selecting the modport in the module instance. An alternate coding style is to leave
the modport selection out of the module definition, and instead postpone selecting the
modport until the module is instantiated. The following example declares the first
port of the slave module as a simple_ahb interface port, but does not specify which
modport definition to use.
module slave
(simple_ahb ahb,               // interface port without modport
 // other ports
 input  logic       s_clk,     // slave clock
 input  logic       rstN,      // reset, active low
 output logic [7:0] thing1,    // misc signal; not part of bus
 input  logic [7:0] thing2     // misc signal; not part of bus
);
  ... // slave module functionality not shown
endmodule: slave
The specific modport of the interface can then be specified when the module is
instantiated, and an instance of an interface is connected to a module instance. The
connection is specified as:
<interface_instance_name>.<modport_name>
For example:
slave s (.ahb(ahb1.slave_ports), // select slave modport
         .rstN(chip_rstN),
         .*                      // wildcard connection shortcut
        );
When the modport to be used is specified in the module instance, the module definition
can use either a type-specific interface port or a generic interface port type, as
discussed in section 10.2 (page 366).
NOTE
A modport can be selected in either the module port definition or the module
instance, but not both.
Specifying the modport as part of the port declaration also allows the module to be
synthesized independently from other modules. It also helps make the module more
self-documenting. Engineers who read or maintain the module can immediately see
which modport is to be used with that module.
Example 10-5: Interface with modports for custom views of interface signals
interface simple_ahb (
  input logic hclk,      // bus transfer clk
  input logic hresetN    // bus reset, active low
);
  logic [31:0] haddr;    // transfer start address
  logic [31:0] hwdata;   // data sent to slave
  logic [31:0] hrdata;   // return data from slave
  logic [ 2:0] hsize;    // transfer size
  logic        hwrite;   // 1 for write, 0 for read
  logic        hready;   // 1 for transfer finished
  // additional AHB signals only used by bus master
  logic [ 3:0] hprot;    // transfer protection mode
  logic [ 2:0] hburst;   // transfer burst mode
  logic [ 1:0] htrans;   // transfer type
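The modport definitions that complete this listing are missing from this copy. Based on the description that follows (the master view can use hprot, hburst and htrans, while the slave view cannot), they could be sketched as below. The interface is repeated in condensed form so that the sketch is complete on its own, and the direction lists are an assumption that follows Example 10-1.

```systemverilog
// Sketch only: modport contents are inferred from the surrounding text.
interface simple_ahb (
  input logic hclk,      // bus transfer clk
  input logic hresetN    // bus reset, active low
);
  logic [31:0] haddr, hwdata, hrdata;
  logic [ 2:0] hsize;
  logic        hwrite, hready;
  // additional AHB signals only used by bus master
  logic [ 3:0] hprot;
  logic [ 2:0] hburst;
  logic [ 1:0] htrans;

  // Master view: includes the three master-only signals
  modport master_ports (
    output haddr, hwdata, hsize, hwrite,
    output hprot, hburst, htrans,
    input  hrdata, hready, hclk, hresetN
  );

  // Slave view: hprot, hburst and htrans are simply not listed,
  // so a module using this modport cannot reference them
  modport slave_ports (
    output hrdata, hready,
    input  haddr, hwdata, hsize, hwrite, hclk, hresetN
  );
endinterface: simple_ahb
```

A reference such as ahb.hprot inside a module whose port is declared as simple_ahb.slave_ports would then be reported as an error when the design is compiled or elaborated.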
A module that uses the simple_ahb.master_ports modport can use the hprot,
hburst and htrans signals. A module that uses the simple_ahb.slave_ports
modport cannot access these 3 signals. Since these signals are not listed in the
slave_ports modport, it is as if those signals do not even exist.
It is also possible to have internal signals within an interface that are not visible
through any of the modport views. These internal signals might be used by protocol
checkers or other functionality contained within the interface.
SystemVerilog interfaces can do more than just group related signals together.
Interfaces can also encapsulate functionality for the communication between modules.
By adding the communication functionality to the interface, each module that
uses the interface can simply reference the functionality, without having to duplicate
that functionality in each module. Encapsulated functionality in an interface can also
be verified independent of the modules that use the interface.
Functionality encapsulated in an interface can be defined by using tasks and functions.
Tasks and functions in an interface are referred to as interface methods. The
interface methods (tasks and functions) can be imported into the modules that need
them by using an import statement within the modport definition for the module.
Importing functions in a modport is similar to importing functions from a package, as
described in Chapter 4, section 4.2.2 (page 104).
The following example adds two functions to the simple AHB interface — one to
generate a parity bit value (using odd parity), and another function to check that data
matches the calculated parity. The hwdata and hrdata vectors have been declared
1 bit wider than in previous examples, with the extra bit used as a parity bit.
Example 10-6: Interface with internal methods (functions) for parity logic
interface simple_ahb (
input logic hclk, // bus transfer clk
input logic hresetN // bus reset, active low
);
logic [31:0] haddr; // transfer start address
logic [32:0] hwdata; // data to slave, with parity bit
logic [32:0] hrdata; // data from slave, with parity bit
logic [ 2:0] hsize; // transfer size
logic hwrite; // 1 for write, 0 for read
logic hready; // 1 for transfer finished
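The functions and modports that belong to this listing are missing here. The sketch below shows one way they might be written. The function names parity_gen and parity_chk are illustrative, and "odd parity" is taken to mean that the 33-bit word, parity bit included, contains an odd number of 1s; the original listing may compute parity differently.

```systemverilog
// Sketch only: function names and the parity convention are assumptions.
interface simple_ahb (
  input logic hclk,      // bus transfer clk
  input logic hresetN    // bus reset, active low
);
  logic [31:0] haddr;
  logic [32:0] hwdata;   // bit 32 carries the parity bit
  logic [32:0] hrdata;   // bit 32 carries the parity bit
  logic [ 2:0] hsize;
  logic        hwrite, hready;

  // Returns the bit that makes {parity, data} hold an odd number of 1s
  function automatic logic parity_gen (input logic [31:0] data);
    return ~^data;       // 1 when data has an even number of 1s
  endfunction

  // Returns 1 if a 33-bit word has odd parity, 0 on a parity error
  function automatic logic parity_chk (input logic [32:0] word);
    return ^word;        // 1 when the word has an odd number of 1s
  endfunction

  // The import statement makes the methods visible through each modport
  modport master_ports (
    output haddr, hwdata, hsize, hwrite,
    input  hrdata, hready, hclk, hresetN,
    import parity_gen, parity_chk
  );
  modport slave_ports (
    output hrdata, hready,
    input  haddr, hwdata, hsize, hwrite, hclk, hresetN,
    import parity_gen, parity_chk
  );
endinterface: simple_ahb
```

A module with an interface port named ahb would call the methods through the port name, for example ahb.hwdata <= {ahb.parity_gen(data), data}; on the sending side and ahb.parity_chk(ahb.hwdata) on the receiving side, where data is a hypothetical 32-bit value local to the module.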
Synthesis compilers place the same RTL coding restrictions on the contents of an
interface that are placed in modules. One of these restrictions is that tasks must execute
in zero time. Using a void function instead of a task enforces this synthesis
restriction. Chapter 7, section 7.3 (page 263) discusses the advantages of using void
functions in synthesizable RTL models.
NOTE
The SystemVerilog language rules require that functions execute in zero simulation
time, which adheres to the synthesis requirement of zero-delay interface functionality.
An interface can also contain verification routines and assertions. This verification
code can be hidden from synthesis by enclosing it in the pragma pair:
//synthesis translate_off and //synthesis translate_on.
In addition to task and function methods, interfaces can also contain initial and
always procedural blocks and continuous assignments. Procedural code can be used
to model functionality within an interface that affects the information communicated
across the bus the interface represents.
Example 10-7 adds a clock generator for the simple AHB bus hclk, and a reset
synchronizer for the bus hresetN. In the previous examples of this interface, these
signals were generated external to the interface, and passed in as input ports of the
simple AHB interface. This example replaces those inputs with the chip (or system)
level clock and reset, and uses these chip-level signals to generate a local bus clock
and bus reset. This local functionality then becomes part of the encapsulated bus communication
between the master and slave modules.
Example 10-7: Interface with internal procedural code to generate bus functionality
interface simple_ahb (
input logic chip_clk, // external clock from the chip
input logic chip_rstN // bus reset, active low
);
logic hclk; // local bus transfer clk
logic hresetN; // local bus reset, active low
logic [31:0] haddr; // transfer start address
logic [31:0] hwdata; // data sent to slave
logic [31:0] hrdata; // return data from slave
logic [ 2:0] hsize; // transfer size
logic hwrite; // 1 for write, 0 for read
logic hready; // 1 for transfer finished
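The procedural code of this listing is missing from this copy. The sketch below shows one plausible form, assuming a divide-by-two bus clock and a two-stage reset synchronizer with asynchronous assertion and synchronous release. The divide ratio and the rst_meta signal are hypothetical choices made for illustration.

```systemverilog
// Sketch only: the clock ratio and synchronizer structure are assumptions.
interface simple_ahb (
  input logic chip_clk,   // external clock from the chip
  input logic chip_rstN   // external reset from the chip, active low
);
  logic        hclk;      // local bus transfer clk
  logic        hresetN;   // local bus reset, active low
  logic [31:0] haddr, hwdata, hrdata;
  logic [ 2:0] hsize;
  logic        hwrite, hready;
  logic        rst_meta;  // hypothetical first stage of the reset synchronizer

  // Bus clock: chip_clk divided by two
  always_ff @(posedge chip_clk or negedge chip_rstN)
    if (!chip_rstN) hclk <= 1'b0;
    else            hclk <= ~hclk;

  // Bus reset: asserted as soon as chip_rstN goes low,
  // released two hclk edges after chip_rstN goes high
  always_ff @(posedge hclk or negedge chip_rstN)
    if (!chip_rstN) {hresetN, rst_meta} <= 2'b00;
    else            {hresetN, rst_meta} <= {rst_meta, 1'b1};

  modport master_ports (
    output haddr, hwdata, hsize, hwrite,
    input  hrdata, hready, hclk, hresetN
  );
  modport slave_ports (
    output hrdata, hready,
    input  haddr, hwdata, hsize, hwrite, hclk, hresetN
  );
endinterface: simple_ahb
```

The master and slave modules are unchanged by this: they still see hclk and hresetN as inputs through their modports, without needing to know where those signals are generated.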
Interfaces can use parameter redefinition in the same way as modules. This allows
interface models to be configurable, so that each instance of the interface can have a
different configuration. Parameters can be used in interfaces to make vector sizes and
other declarations within the interface reconfigurable by using SystemVerilog’s
parameter redefinition constructs. The parameter values of an interface can be redefined
when the interface is instantiated, in the same way as module redefinition.
Chapter 3, section 3.8 (page 93) discusses the various styles of parameter redefinition.
The following variation of the simple AHB example adds parameters to make the
data vector widths configurable when the interface is instantiated. Any module interface
ports to which the interface instance is connected will use the vector sizes of that
interface instance.
Example 10-8: Parameterized interface with configurable bus data word size
interface simple_ahb
#(parameter DWIDTH=32) // Data bus width, 32-bit default
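Only the first two lines of this listing are present. A sketch of how the complete parameterized interface might read is given below, assuming that DWIDTH sizes the two data vectors and that the rest of the interface matches the earlier examples.

```systemverilog
// Sketch only: everything after the parameter list is an assumption
// based on the earlier simple_ahb examples.
interface simple_ahb
#(parameter DWIDTH = 32)          // Data bus width, 32-bit default
(
  input logic hclk,               // bus transfer clk
  input logic hresetN             // bus reset, active low
);
  logic [31:0]       haddr;       // transfer start address
  logic [DWIDTH-1:0] hwdata;      // data sent to slave
  logic [DWIDTH-1:0] hrdata;      // return data from slave
  logic [ 2:0]       hsize;       // transfer size
  logic              hwrite;      // 1 for write, 0 for read
  logic              hready;      // 1 for transfer finished

  modport master_ports (
    output haddr, hwdata, hsize, hwrite,
    input  hrdata, hready, hclk, hresetN
  );
  modport slave_ports (
    output hrdata, hready,
    input  haddr, hwdata, hsize, hwrite, hclk, hresetN
  );
endinterface: simple_ahb
```

Inside a module, ahb.hwdata then takes whatever width the connected interface instance was given, so the module code does not need its own copy of the parameter.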
The following code snippet redefines the data word size of the interface in Example
10-8 to a 64-bit word size.
simple_ahb #(.DWIDTH(64)) ahb1(.hclk,
.hresetN
);
Interfaces are a powerful modeling construct that SystemVerilog added to the original
Verilog HDL. An interface port is an abstraction from traditional Verilog modeling,
where a group of related signals had to be declared one signal at a time. Those
separate declarations then had to be duplicated in every module that used the related
signals, as well as at the block level that connected modules together.
In its most basic form, SystemVerilog interfaces encapsulate related signals
together as a reusable modeling component. The interface can then be used as a single
port on modules, replacing multiple individual ports for a group of related signals.
The modeling abstraction provided by interfaces can be a powerful tool for RTL
design engineers. Designers can define a group of related signals one time, as an
interface, and then use those signals any number of times without having to duplicate
the definitions.
Synthesis compilers handle using interfaces to encapsulate related signals very
well. Design engineers can work at a higher level of abstraction — with all the advantages
of abstraction — and synthesis compilers translate the abstract encapsulation of
signals into the individual module ports, without engineers needing to get bogged
down with the individual port declarations, and ensuring that redundant declarations
in multiple modules match perfectly.
Synthesis compilers support both styles of specifying which modport is to be used
with a module: as part of the port declaration, or when the module is instantiated (see
section 10.3.1, page 368). However, the modport must be specified with the port declaration
if a module is synthesized independently from other modules.
Synthesis compilers will expand the interface port of a module into the individual
ports represented in the modport definition when a module is synthesized independent
of other modules, or when multiple modules are synthesized with the synthesis compiler
configured to preserve the RTL module hierarchy. Most synthesis compilers will
use a Verilog-1995 port declaration style, where the port list contains the port names
and order, and the port sizes and data types are declared inside the module, instead of
in the port list. A module can have any number of interface ports, and the interface
ports can be specified in any order with other ports. The examples in this book list the
interface port first, simply to emphasize the interface port.
The following code snippets show the possible pre-synthesis and post-synthesis
module definitions of a master module that uses the simple_ahb interface shown in
Example 10-3 (page 363).
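The snippets themselves are missing from this copy. The sketch below illustrates the idea, with the pre-synthesis port list shown as a comment and a possible post-synthesis equivalent in Verilog-1995 style. The expanded port names (ahb_haddr and so on) are an assumption; the actual naming and ordering of expanded ports is specific to each synthesis compiler.

```systemverilog
// Pre-synthesis port list (RTL), as described in this chapter:
//
//   module master
//   (simple_ahb.master_ports ahb,
//    input  logic       m_clk,
//    input  logic       rstN,
//    input  logic [7:0] thing1,
//    output logic [7:0] thing2
//   );

// One possible post-synthesis form: the interface port has been expanded
// into the individual signals of the master_ports modport.
// The ahb_ prefix is a hypothetical naming scheme.
module master (ahb_haddr, ahb_hwdata, ahb_hsize, ahb_hwrite,
               ahb_hrdata, ahb_hready, ahb_hclk, ahb_hresetN,
               m_clk, rstN, thing1, thing2);
  output [31:0] ahb_haddr;
  output [31:0] ahb_hwdata;
  output [ 2:0] ahb_hsize;
  output        ahb_hwrite;
  input  [31:0] ahb_hrdata;
  input         ahb_hready;
  input         ahb_hclk;
  input         ahb_hresetN;
  input         m_clk;
  input         rstN;
  input  [ 7:0] thing1;
  output [ 7:0] thing2;
  // gate-level implementation not shown
endmodule
```

The port directions come directly from the modport, which is why the modport must be known when a module is synthesized on its own.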
The RTL synthesis compilers available at the time this book was written are somewhat
limited in the support of using interfaces to encapsulate functionality using tasks
and procedural code.
It is possible, for example, to encapsulate the full functionality of a FIFO within an
interface, which would allow modules that use the encapsulated signals to run at different
clock speeds without any loss of data. Complete error-correction functionality,
and other complex operations related to a group of signals, can also be bundled with
those signals. This more advanced level of encapsulation is not supported, or has only
limited support, by most synthesis compilers. These restrictions limit the usefulness
of procedural code in an interface.
Interfaces can also bundle verification code, such as assertions and self-checking
routines for the encapsulated signals and functionality. Verification related code in an
interface can be ignored by synthesis compilers using synthesis translate_off and
translate_on pragmas or `ifdef conditional compilation.
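Both techniques can be sketched in a reduced interface as follows. The two checks are hypothetical examples of verification code, and the macro name SYNTHESIS is a widely used convention that is assumed here; a given tool flow may define a different macro or none at all.

```systemverilog
// Sketch only: reduced interface with two illustrative verification checks.
interface simple_ahb (input logic hclk, input logic hresetN);
  logic [31:0] haddr;
  logic        hwrite, hready;

  modport master_ports (output haddr, hwrite, input hready, hclk, hresetN);
  modport slave_ports  (output hready, input haddr, hwrite, hclk, hresetN);

  // Style 1: conditional compilation (assumes the synthesis flow defines SYNTHESIS)
`ifndef SYNTHESIS
  // The address must be a known value whenever a write is requested
  assert property (@(posedge hclk) disable iff (!hresetN)
                   hwrite |-> !$isunknown(haddr))
    else $error("haddr is unknown during a write transfer");
`endif

  // Style 2: pragma pair recognized by synthesis compilers
  // synthesis translate_off
  always @(posedge hclk)
    if (hresetN && $isunknown(hready))
      $warning("hready is unknown while the bus is out of reset");
  // synthesis translate_on
endinterface: simple_ahb
```

Simulators compile both checks, since the pragma is an ordinary comment to them and SYNTHESIS is normally left undefined in simulation.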
10.8 Summary
This chapter has presented interfaces and interface ports, powerful RTL modeling
constructs which SystemVerilog added to the original Verilog language. An interface
encapsulates the communication between major blocks of a design. Using interfaces,
the detailed and redundant module port and netlist declarations are greatly simplified.
The details are moved to an interface definition, where those bus details only need to
be defined once, instead of in many different modules.
The interface modport definition provides a simple yet powerful way to customize
the interface for each module that is connected to the interface. Each modport definition
defines the port directions for a particular view of the interface. One module can
see a specific signal in the interface as an output, while another module sees that same
signal as an input. Modport definitions also allow some signals or methods in an
interface to be hidden from certain modules.
Interfaces do more than provide a way to bundle signals together. Interfaces can
also encapsulate functionality that operates on the related signals by using methods
(tasks and functions). The ability to incorporate methods in an interface further
reduces redundant code that is used in multiple modules. Methods are defined in one
place, in the interface, and can be imported into any number of modules as part of
each module’s modport definition. Functions in interfaces are synthesizable.
A synthesizable interface must adhere to the same RTL modeling rules as a synthesizable
module. Interfaces are capable of modeling at non-RTL levels as well, and are
a powerful construct for transaction-level modeling and verification testbenches.
Advanced verification methodologies such as UVM, OVM and VMM rely on interfaces
to communicate between an object-oriented testbench and the design modules
being verified.
Appendix A
Best Practice Coding Guidelines
This book emphasizes writing RTL models that simulate and synthesize correctly,
and yield best Quality of Results (QoR) in the gate-level implementation created by
synthesis compilers. Each chapter contains several short “Best Practice Guideline” coding
recommendations. For convenience, this appendix provides a summary of these
recommendations.
Readers are encouraged to refer to the full description on each best practice coding
recommendation for the full details of the recommendation and to understand why it
is important.
3-3 Use the 4-state logic data type to infer variables in RTL models. Do not use 2-state
types in RTL models. An exception to this guideline is to use the int type
to declare for-loop iterator variables. (p. 68)
3-4 Use a simple vector declaration when a design mostly selects either the entire
vector or individual bits of the vector. Use a vector with subfields when a design
frequently selects parts of a vector, and those parts fall on known boundaries,
such as byte or word boundaries. (p. 74)
3-5 Only use variable initialization in RTL models that will be implemented as an
FPGA, and only to model power-up values of flip-flops. (p. 76)
3-6 Only use in-line variable initialization in RTL models. Do not use initial procedures
to initialize variables. (p. 76)
3-7 Use a logic data type to connect design components together whenever the design
intent is to have single driver functionality. Use wire or tri net types only
when the design intent is to permit multiple drivers. (p. 78)
3-8 If the default net type is changed, always use `default_nettype as a pair of
directives, with the first directive setting the default to the desired net type, and
the second directive setting the default back to wire. (p. 82)
3-9 Use the ANSI-C style declarations for module port lists. Declare both input ports
and output ports as a logic type. (p. 88)
3-10 Use in-line named parameter redefinition for all parameter overrides. Do not use
in-line parameter-order redefinition or defparam statements. (p. 99)
4-2 Avoid using $unit like the Bubonic plague! Instead, use packages for shared
definitions. (p. 114)
4-3 Use the explicit-style enumerated type declarations in RTL models, where the
base type and label values are specified, rather than inferred. (p. 115)
4-4 Only use packed unions in RTL models. (p. 132)
5-3 A function should only modify its function return variable and internal temporary
variables that never leave the function. (p. 164)
5-4 Avoid mixing signed and unsigned expressions with comparison operations.
Both operands should be either signed or unsigned. (p. 166)
5-5 Use the == and != equality operators in RTL models. Do not use the === and
!== case equality operators. (p. 169)
5-6 Use operators to shift or rotate a vector a variable number of bits. Do not use
loops to shift or rotate the bits of a vector a variable number of bits. (p. 180)
5-7 For better synthesis Quality of Results (QoR):
(a) Use shift operators for multiplication and division by a power of 2, instead
of the *, /, % and ** arithmetic operators.
(b) For multiplication and division by a non-power of 2, use a constant value for
one operand of the operation, if possible.
(c) For multiplication and division when both operands are non-constant values,
use smaller vector sizes, such as 8-bits. (p. 185)
5-8 Use unsigned types for all RTL model operations. The use of signed data types
is seldom needed to model accurate hardware behavior. (p. 189)
5-9 Only use the increment and decrement operators with combinational logic procedures
and to control loop iterations. Do not use increment and decrement to
model sequential logic behavior. (p. 191)
6-6 Use the continue and break jump statements to control loop iterations. Do not
use the disable jump statement. (p. 241)
6-7 Do not use the no-op statement for RTL modeling. (p. 243)
8-3 Use separate combinational logic processes to calculate intermediate values required
in sequential logic procedures. Do not embed intermediate calculations
inside of sequential logic procedures. (p. 283)
8-4 Declare temporary variables that are used in a sequential logic block as local
variables within the block. (p. 285)
8-5 Write RTL models using a preferred type of reset, and let synthesis compilers
map the reset functionality to the type of reset supported by the target ASIC or
FPGA. Only write RTL models to use the same type of reset that a specific target
ASIC or FPGA uses if it is necessary in order to achieve the most optimal speed
and area for that specific device. (p. 286)
8-6 Be consistent in the use of active-high or active-low resets. Use a consistent
naming convention for active-high and active-low control signals. (p. 289)
8-7 Model RTL flip-flops with just a reset input or a set input in order to achieve best
synthesis Quality of Results (QoR). Only model set/reset flip-flops if needed for
the functionality of the design. (p. 291)
8-8 Multiple clock designs should be partitioned into multiple modules, so that each
module only uses a single clock. (p. 295)
8-9 If intra-assignment delays are used at all, only use a unit delay. (p. 298)
8-10 Model FSMs in a separate module. (Support logic for the FSM, such as a counter
that is only used by the FSM, can be included in the same module.) (p. 299)
8-11 Define a logic (4-state) base type and vector size for enumerated
variables. (p. 303)
8-12 Use enumerated variables for FSM state variables. Do not use parameters and
loosely typed variables for state variables. (p. 303)
8-13 Make the engineering decision on which encoding scheme to use for a Finite
State Machine at the RTL modeling stage of design, rather than during the synthesis
process. (p. 304)
8-14 For most Finite State Machines, use a three-process coding style, where a separate
process models each of the three main blocks of the state machine. (p. 305)
8-15 Use reverse case statements to model one-hot state machines that evaluate 1-bit
values. Do not use multi-bit vectors for one-hot case expressions and case
items. (p. 313)
8-16 Behavioral RAM models should be defined in a separate module. (p. 317)
9-3 Fully specify the output values of decision statements to avoid unintended latches.
Do not use logic reduction optimizations to avoid latches, unless needed for
a specific circumstance. (p. 331)
9-4 Use the unique or priority decision modifiers if gate-level logic reduction is
needed for avoiding unintended latches. Do not use the antiquated Verilog-2001
coding style of X value assignments. (p. 331)
9-5 Use the unique decision modifier in RTL models when logic reduction is desirable
to prevent inferred latches. Do not use the unique0 modifier. (p. 341)
9-6 In general, fully specify all case statements using either a default case item that
assigns known values, or a pre-case assignment with known values. An exception
to this guideline is a one-hot state decoder using a reverse case
statement. (p. 344)
9-7 Use the unique or priority decision modifiers instead of an X value assignment
for unused decision values. (p. 345)
9-8 Use the unique or priority decision modifiers if logic reduction optimizations
are required. Do not use the full_case (or parallel_case) synthesis
pragma. Ever. (p. 350)
Appendix B
SystemVerilog Reserved Keywords
Abstract — Each version of the Verilog and SystemVerilog standard has added additional
reserved keywords to the previous generation of the standard. This appendix
lists:
Table B-1 — the full SystemVerilog-2012 reserved keyword list
Table B-2 — the original Verilog-1995 reserved keyword list
Table B-3 — additional keywords reserved in the Verilog-2001 standard
Table B-4 — additional keywords reserved in the Verilog-2005 standard
Table B-5 — additional keywords reserved in the SystemVerilog-2005 standard
Table B-6 — additional keywords reserved in the SystemVerilog-2009 standard
Table B-7 — additional keywords reserved in the SystemVerilog-2012 standard
Section 2.2.4 in Chapter 2 discusses using the `begin_keywords and
`end_keywords compiler directive pair to control which keywords should be
reserved when SystemVerilog source code is compiled.
Table B-1 lists the reserved keywords for the SystemVerilog-2012 standard. The
compiler directive `begin_keywords "1800-2012" instructs compilers to reserve
the keywords listed in this table.
Note: Some keywords in this table have been hyphenated in order to fit the format
of this book. No actual keywords contain hyphens.
Table B-2 lists the reserved keywords used in the original Verilog-1995 standard.
The compiler directive `begin_keywords "1364-1995" instructs compilers to
reserve only the keywords listed in this table.
Table B-3 lists only the reserved keywords that were added with the Verilog-2001
language. The compiler directive `begin_keywords "1364-2001" instructs compilers
to reserve the keywords listed in this table, plus the keywords reserved in previous
versions.
The Verilog-2005 standard reserved only one additional keyword, which is listed in
Table B-4. The compiler directive `begin_keywords "1364-2005" instructs compilers to
reserve the keyword listed in this table, plus all keywords reserved in all previous
versions, as listed in Tables B-2 and B-3.
uwire
Table B-6 lists the reserved keywords that were added with the SystemVerilog-2009
standard. The compiler directive `begin_keywords "1800-2009" instructs
compilers to reserve the keywords listed in this table, plus all keywords reserved in all
previous versions.
accept_on     reject_on      sync_accept_on
checker       restrict       sync_reject_on
endchecker    s_always       unique0
eventually    s_eventually   until
global        s_nexttime     until_with
implies       s_until        untyped
let           s_until_with   weak
nexttime      strong
The SystemVerilog-2012 standard reserved four more keywords. Table B-7 lists
only the reserved keywords that were added with SystemVerilog-2012. The compiler
directive `begin_keywords "1800-2012" instructs compilers to reserve the keywords
listed in this table, plus all keywords reserved in previous versions.
implements nettype
interconnect soft
The SystemVerilog-2017 standard does not add any additional reserved keywords
to the SystemVerilog-2012 standard. Table B-7 lists only the reserved keywords that
were added with SystemVerilog-2012. The compiler directive
`begin_keywords "1800-2017" instructs a compiler to use the same reserved
keyword list as `begin_keywords "1800-2012".
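As an illustration of how the keyword directives described in this appendix might be used, the following sketch compiles a legacy module with only the Verilog-2001 keywords reserved, so that a signal named priority (a keyword first reserved in SystemVerilog-2005) remains a legal identifier. The module and signal names are hypothetical and are not taken from this book.

```systemverilog
// Legacy code: reserve only the Verilog-2001 (and earlier) keywords
`begin_keywords "1364-2001"
module legacy_block (input wire clk, input wire d, output reg q);
  reg priority;               // legal identifier in this keyword set
  always @(posedge clk) begin
    priority <= d;
    q        <= priority;
  end
endmodule
`end_keywords                 // restore the previous keyword set

// New code: reserve the full SystemVerilog-2012 keyword list
`begin_keywords "1800-2012"
module new_block (input logic clk, input logic d, output logic q);
  always_ff @(posedge clk) q <= d;
endmodule
`end_keywords
```

The directives are placed outside of the module definitions, and each `begin_keywords is paired with an `end_keywords so that the setting does not leak into other files compiled afterwards.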
Appendix C
X Optimism and X Pessimism
in RTL Models
Abstract—This paper explores the advantages and hazards of X-optimism and X-pessimism,
and of 2-state versus 4-state simulation. A number of papers have been written
over the years on the problems of optimistic versus pessimistic X propagation in
simulation. Some papers argue that Verilog/SystemVerilog is overly optimistic, while
other papers argue that SystemVerilog can be overly pessimistic. Which viewpoint is
correct? Just a few years ago, some simulator companies were promising that 2-state
simulations would deliver substantially faster, more efficient simulation run-times,
compared to 4-state simulation. Now it seems the tables have turned, and Verilog/SystemVerilog
simulators are providing modes that pessimistically propagate X values,
with the promise that 4-state simulation will more accurately and efficiently detect
obscure design bugs. Which promise is true? This paper answers these questions.
Keywords— Verilog, SystemVerilog, RTL simulation, 2-state, 4-state, X propagation,
X optimism, X pessimism, register initialization, randomization, UVM
C.1 Introducing My X
Terminology. For the purposes of this paper, X-optimism is defined as any time simulation
converts an X value on an expression or logic gate input into a 0 or a 1 on the
result. X-pessimism is defined as any time simulation passes an X on an input to an
expression or logic gate through to the result. As will be shown in this paper, sometimes
X-optimism is desirable, and sometimes it is not. Conversely, in different circumstances,
X-pessimism can be the right thing or the wrong thing.
Note: In this paper, the term “value sets” is used to refer to 2-state values (0 and 1)
and 4-state values (0, 1, Z, X). The term “data types” is used as a general term for all
net types, variable types, and user-defined types. The terms value sets and data types
are not used in the same way in the official IEEE SystemVerilog standard [3], which
is written primarily for companies that implement software tools such as simulators
and synthesis compilers. The SystemVerilog standard uses terms such as “types”,
“objects” and “kinds”, which have specific meaning for those that implement tools,
but which this author feels are neither commonplace nor intuitive for engineers that
use the SystemVerilog language.
Logic X is a simulator’s way of saying that it cannot predict whether the value in
actual silicon would be 0 or 1. There are several conditions where simulation will
generate a logic X:
• Uninitialized 4-state variables
• Uninitialized registers and latches
• Low power logic shutdown or power-up
• Unconnected module input ports
• Multi-driver Conflicts (Bus Contention)
• Operations with an unknown result
• Out-of-range bit-selects and array indices
• Logic gates with unknown output values
• Setup or hold timing violations
• User-assigned X values in hardware models
• Testbench X injection
The var keyword explicitly declares a variable. It can be used by itself, or in conjunction
with other keywords. In most contexts, the var keyword is optional, and is
seldom used.
var integer i1; // same as "integer i1"
var i2; // same as "var reg i2"
Example 1: The var variable type
The logic keyword is not a variable type or a net type. Nor is the bit keyword.
logic and bit define the digital value set that a net or variable models; logic indicates
a 4-state value set (0, 1, Z, X) and bit indicates a 2-state value set (0, 1). The
reg, integer, time and var variable types infer a 4-state logic value set.
The logic keyword can be used in conjunction with the var, reg, integer or
time keyword or a net type keyword (such as wire) to explicitly indicate the value
set of the variable or net. For example:
var logic [31:0] v; // 4-state 32-bit variable
wire logic [31:0] w; // 4-state 32-bit net
Example 2: 4-state variable and net declarations
The logic (or bit) keyword can be used without the var or a net type keyword.
In this case, either a variable or net is inferred, based on context. If logic or bit is
used in conjunction with an output port, an assign keyword, or as a local declaration,
then a variable is inferred. If logic is used in conjunction with an input or
inout port declaration, then a net of the default net type is inferred (typically wire).
An input port can also be declared with a 4-state variable type, using either the keyword
triplet input var logic or the keyword pair input var.
module m1 (
  input  logic [7:0] i, // 4-state wire inferred
  output logic [7:0] o  // 4-state var inferred
);
  logic [7:0] t; // 4-state var inferred
  ...
endmodule
Example 3: Default port data types
The SystemVerilog standard [3] defines that 4-state variables begin simulation with
an uninitialized value of X. This rule is one of the biggest causes of X values at the
start of simulation.
When reading bits from a vector, if the index is outside the range of bits in the vector,
a logic X is returned for each bit position that is out-of-range. When reading
members of an array, if the index is outside the range of addresses in the array, a logic
X is returned for the entire word being read. Of course, even in-range bit-selects,
part-selects and array selects can result in an X value being returned, if the vector or array
being selected contains X values.
Section C.4.8 (page 425), on X-pessimism, discusses reading vector bits and array
members with unknown indices. Section C.3.6 (page 414), on X-optimism, discusses
writing to vector bits and array members with unknown or out-of-range indices.
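The out-of-range read rules described above can be observed with a small simulation-only sketch. The names vec, mem and idx, and the values used, are assumptions made for this illustration.

```systemverilog
module tb_out_of_range_read;
  logic [7:0] vec = 8'hA5;        // 8-bit vector, valid bit indices 0 to 7
  logic [7:0] mem [0:3];          // 4-word array, valid addresses 0 to 3
  logic [3:0] idx = 4'd9;         // index outside both ranges
  logic       b;
  logic [7:0] word;

  initial begin
    foreach (mem[k]) mem[k] = 8'h00;  // array holds only known values
    b    = vec[idx];                  // out-of-range bit-select reads X
    word = mem[idx];                  // out-of-range array read returns X for the entire word
    $display("b = %b, word = %h", b, word);  // prints: b = x, word = xx
  end
endmodule
```

A variable index is used rather than a constant, because some tools report a constant out-of-range select at compile time instead of returning X during simulation.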
2'b11:
>
In this example, a select value of 2'b00 is not used by the design, and should
never occur. The default assignment of a logic X serves as a simulation flag, should
select ever have a value of 2'b00. The same default assignment of X serves as a
don’t care flag for synthesis. Synthesis tools see this X-assignment as an indication
that logic minimization can be performed for any values of the case expression
(select, in this example) that were not explicitly decoded.
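A minimal sketch of the coding style this passage describes is shown below. It is not the original example; the module name, the data inputs a, b and c, and the output y are assumed for illustration. Only the select values 2'b01, 2'b10 and 2'b11 are decoded, and the default branch assigns X.

```systemverilog
module decode_with_x_default (
  input  logic [1:0] select,
  input  logic       a, b, c,
  output logic       y
);
  always_comb begin
    case (select)
      2'b01:   y = a;
      2'b10:   y = b;
      2'b11:   y = c;
      default: y = 'x;  // 2'b00 should never occur: X flags it in simulation,
                        // and is treated as a don't care by synthesis
    endcase
  end
endmodule
```

The default branch is also taken when select contains X or Z bits, so the X assignment propagates a problem with select as well as flagging the unused value.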
Optimism: an inclination to put the most favorable construction upon actions and
events or to anticipate the best possible outcome. [4]
In simulation, X-optimism is when there is some uncertainty on an input to an
expression or gate (the silicon value might be either 0 or 1), but simulation comes up
with a known result instead of an X. SystemVerilog is, in general, an optimistic language.
There are many conditions where an ambiguous condition exists in a model,
but SystemVerilog propagates a 0 or 1 instead of a logic X. A simple example of
X-optimism is an AND gate. In SystemVerilog, an X ANDed with 0 will result in 0, not
X.
An optimistic X can be a good thing! X-optimism can more accurately represent
silicon behavior when an ambiguous condition occurs in silicon. Consider the following
example, shown in Figure C-1.
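[Figure C-1: flip-flop with synchronous, active-low reset; d and rstN drive an AND gate that feeds the flip-flop, which is clocked by clk and drives q]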
This circuit shows a flip-flop with synchronous, active-low reset. In actual silicon,
the d input might be ambiguous at power-up, powering up as either a 0 or 1. If the
rstN input of the AND gate is 0, however, the output of the AND gate will be 0,
despite the ambiguous power-up value of d. This correctly resets the flip-flop at the
next positive edge of clk.
In simulation, the ambiguous power-up value of d is represented as an X. If this X
were to pessimistically propagate to the AND gate output, even when rstN is 0, the
design would not correctly reset, which could cause all sorts of problems. Fortunately,
SystemVerilog AND operators and AND gates are X-optimistic. If any input is 0, the
result is 0. Because of X-optimism, simulation accurately models silicon behavior,
and the simulated models function correctly.
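The circuit described above might be modeled at the RTL level as in the following sketch. The module name is an assumption; the signal names follow the text.

```systemverilog
module dff_sync_reset (
  input  logic clk,
  input  logic rstN,  // synchronous, active-low reset
  input  logic d,
  output logic q
);
  // The AND operator is X-optimistic: when rstN is 0 the result is 0,
  // even if d is still X at the start of simulation.
  always_ff @(posedge clk)
    q <= d & rstN;
endmodule
```

With rstN held at 0 and d at X, the first positive edge of clk loads 0 into q, which matches the reset behavior expected of the actual silicon.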
An optimistic X can also be a bad thing! X-optimism can, and will, hide design
problems, especially at the abstract RTL level of verification. At best, these design
bugs are not caught until late in the design cycle during gate-level simulations or
when other low-level analysis tools are used. At worst, design ambiguities that were
hidden by X-optimistic simulation might not be discovered until the design has been
implemented in actual silicon.
Several X-optimistic SystemVerilog constructs are discussed in more detail in this
section.
[Figure: gate-level circuit with signals b, sel and y]
Table C-1 shows the simulation results for an X-optimistic if...else when the control
expression (sel) is unknown, compared to the simulation behavior of MUX and
NAND implementations and actual silicon behavior.
                     simulation behavior
sel  a  b    if...else    MUX     NAND     actual silicon
             RTL          gate    gates    behavior
 X   0  0    0            0       0        0
 X   0  1    1            X       X        0 or 1
 X   1  0    0            X       X        0 or 1
 X   1  1    1            1       X        1
Table C-1: if...else versus gate-level X propagation
Some important things to note from this table are:
• For all rows, the if...else statement propagates a known value instead of the X
value of sel. This X-optimistic behavior could hide error conditions in the design.
• For rows 2 and 3, the X-optimistic if...else behavior only matches one of the possible
values that could occur in actual silicon. The other possible value is not propagated
and therefore the design is not verified with that other possible value.
Later sections of this paper show several ways to detect problems with if conditions,
so that design bugs of this nature do not remain hidden by an optimistic X.
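The X-optimistic if...else behavior listed in Table C-1 can be reproduced with a short simulation-only sketch such as the one below. The testbench name and stimulus values are assumptions; the values correspond to row 3 of the table.

```systemverilog
module tb_if_else_optimism;
  logic sel, a, b, y;

  // X-optimistic decision: an unknown sel is treated as false
  always_comb begin
    if (sel) y = a;
    else     y = b;
  end

  initial begin
    sel = 1'bx; a = 1'b1; b = 1'b0;
    #1 $display("sel=%b a=%b b=%b y=%b", sel, a, b, y);
    // prints: sel=x a=1 b=0 y=0  (the else branch is taken, so y = b)
  end
endmodule
```

Actual silicon could produce either 0 or 1 for these inputs, but simulation only ever exercises the value 0.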
The control value of a case statement is referred to as the case expression. The values
to which the control value is compared are referred to as case items.
always_comb begin
  case (sel)        // sel is the case expression
    1'b1 : y = a;   // 1'b1 is a case item
    1'b0 : y = b;   // 1'b0 is a case item value
  endcase
end
Example 7: case statement X-optimism
Functionally, case and if...else represent similar logic. However, SystemVerilog’s
X-optimistic behavior for a case statement without a default branch is very
different than an if...else decision when the select control is unknown, as is shown
in Table C-2.
Using casez, some, but not all, of the possibilities of sel having a bit with an X or
Z value fall through to the default. Since y is not assigned a value in the default
branch, the value of y would not be changed, and would retain its previous value.
The case...inside statement is also X-optimistic, but less so than casex or
casez. With case...inside, only the bits in case items can have mask (don’t care)
bits. Any X or Z bits in the case expression are treated as literal values.
always_comb begin
  case (sel) inside
    3'b1??: y = a;
    3'b00?: y = b;
    3'b01?: y = c;
    default: $error("sel had unexpected value");
  endcase
end
Example 11: case...inside statement X-optimism
Using case...inside, the values each case item represents are:
• 3'b1?? matches sel values of:
100, 101, 110, 111,
10X, 11X, 1X0, 1X1, 1XX,
10Z, 11Z, 1Z0, 1Z1, 1ZZ,
1XZ, 1ZX
• 3'b00? matches sel values of:
000, 001, 00X, 00Z
• 3'b01? matches sel values of:
010, 011, 01X, 01Z
• default matches sel values of:
0X0, 0X1, 0XX, 0XZ,
0Z0, 0Z1, 0ZZ, 0ZX,
X00, X01, X10, X11,
All forms of wildcard case statements are X-optimistic, but in different ways. The
case...inside does the best job of modeling actual silicon optimism, but can still
differ from true silicon behavior, and can hide problems with a case expression. Sections
C.7 (page 430) and C.9 (page 432) discuss ways to reduce or eliminate this
X-optimism problem with wildcard case statements.
always_latch
if (write && enable) RAM[addr] = data;
Example 13: Array index ambiguity X-optimism
In this example, only the least-significant bit of addr is unknown. A pessimistic
approach would have been to write an unknown value into the RAM locations that
might have been affected by this unknown address bit (addresses 0 and 1 in this
example). SystemVerilog’s X-optimistic rule, however, acts as if no write operation
had occurred. This completely hides the fact that the address has a problem, and does
not accurately model silicon behavior.
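The silent discard of a write with an unknown address can be seen in a small simulation-only sketch. The array size, the data values and the testbench name are assumptions for this illustration; the write is performed directly in an initial procedure to keep the sketch short.

```systemverilog
module tb_unknown_write_index;
  logic [7:0] RAM [0:3];
  logic [1:0] addr;

  initial begin
    foreach (RAM[i]) RAM[i] = 8'h00;  // start with known contents
    addr = 2'b0x;                     // only the least-significant bit is unknown
    RAM[addr] = 8'hFF;                // X-optimistic rule: no write occurs
    $display("RAM[0]=%h RAM[1]=%h", RAM[0], RAM[1]);
    // prints: RAM[0]=00 RAM[1]=00  (neither candidate location changed)
  end
endmodule
```

In silicon, one of the two locations would have been overwritten; simulation shows neither the new data nor any X to indicate that something went wrong.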
endmodule: program_counter
Example 14: Program counter with default wire net types
In Example 14, clock, resetN and loadN are input ports, but no data type has
been defined. These signals will all default to wire nets. The signal new_count is
declared as input logic, and will also default to wire (logic only defines that
new_count can have 4-state values, but does not define the data type of
new_count). Conversely, count is declared as output logic. Module output ports
default to a variable of type reg, unless explicitly declared a different data type.
(Note: The default data type rules changed between the SystemVerilog-2005 and
SystemVerilog-2009 standards for when logic is used as part of port declaration.)
Design bugs can easily occur when a mistake is made and a wire net, that was
intended to only have one driver, is unintentionally driven by two sources. Since
wire types support and resolve multiple drivers, simulation will only propagate an X
if the two values are of the same strength and opposing values. Any other combination
will resolve to a known value, and hide the fact that there were unintentional
multiple drivers.
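The way a wire can hide an unintended second driver might be demonstrated as follows. This is a simulation-only sketch with hypothetical signal names.

```systemverilog
module tb_multi_driver;
  logic a, b;
  wire  y;          // wire nets permit and resolve multiple drivers

  assign y = a;     // intended driver
  assign y = b;     // unintended second driver: legal on a wire

  initial begin
    a = 1'b1; b = 1'b1;
    #1 $display("a=%b b=%b y=%b", a, b, y);  // prints: a=1 b=1 y=1 (conflict hidden)
    b = 1'b0;
    #1 $display("a=%b b=%b y=%b", a, b, y);  // prints: a=1 b=0 y=x (conflict visible)
  end
endmodule
```

If y were declared as a logic variable or as a uwire net instead, the second continuous assignment would be reported as an error when the design is compiled or elaborated, rather than resolved at run time.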
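[Figure: divide-by-two circuit; a flip-flop clocked by clk1 whose Q output (clk2) is fed back through an inverter to its D input]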
In actual silicon, the internal storage of this flip-flop might power up as either a 0 or
a 1. Whichever value it is, clk2 will change value every second positive edge of
clk1, and give the desired behavior of a divide-by-two. In simulation, however, the
ambiguity of starting as either a 0 or 1 is represented as an X. The pessimistic inverter
will propagate this X to the D input. Each positive edge of clk1 will propagate this X
onto Q, which once again feeds back to the input of the inverter. The result is that
clk2 is stuck at an X.
The failure of clk2 to toggle between 0 and 1 will likely lock up downstream registers
that are controlled by clk2. The X-pessimistic simulation will be locked up in
an X state, where actual silicon would not have a problem. This X-pessimism exists at
both the RTL level and at the gate level. The invert operator and the not inverter
primitive are both X-pessimistic. An RTL assignment statement, such as Q <= D, and
the typical gate-level flip-flop will both propagate an X when D is an X.
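An RTL model of the divide-by-two circuit discussed above could look like the following sketch. The module name is an assumption, and the model deliberately has no reset so that the lock-up can be observed.

```systemverilog
module div2_no_reset (
  input  logic clk1,
  output logic clk2
);
  // clk2 begins simulation as X, and ~X evaluates to X,
  // so every clock edge simply reloads X into clk2.
  always_ff @(posedge clk1)
    clk2 <= ~clk2;
endmodule
```

Simulating this model shows clk2 remaining at X for the entire run, regardless of how many clk1 edges occur.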
Several overly pessimistic SystemVerilog constructs that can cause simulation
problems are discussed in this section.
always_comb begin
  if (sel) y = a;
  else
  // synthesis translate_off
  if (!sel)
  // synthesis translate_on
    y = b;
  // synthesis translate_off
  else y = 'x;
  // synthesis translate_on
end
Example 16: if...else statement with X-pessimism and synthesis pragmas
Assuming that sel is only 1-bit wide, the if (sel) will evaluate as true if, and
only if, sel is 1. The first else branch is taken if sel is 0, X or Z (X-optimistic).
This first else branch then tests for if (!sel), which will evaluate as true if, and
only if, sel is 0. If sel is X or Z, the last else branch will be taken. This last branch
assigns y to X, thus propagating the ambiguity of sel. This if...else statement is
now X-pessimistic, propagating X values rather than known values when there is a
problem with the select condition.
Note that the additional code to make the if...else decision be X-pessimistic
might not yield optimal synthesis results. Therefore, the additional checking must be
hidden from synthesis, using either conditional compilation (`ifdef commands) or
synthesis “pragmas”. A pragma is a tool-specific command hidden within a comment
or attribute. The synthesis pragma is ignored by simulation, but tells synthesis compilers
to skip over any code that should not be synthesized.
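The conditional compilation alternative mentioned above might be coded as in the following sketch. It assumes that a macro named SYNTHESIS is defined when the code is read by a synthesis compiler; many tools predefine such a macro, but the name is an assumption here and should be checked for the tool being used. The module name is also hypothetical.

```systemverilog
module mux_x_pessimistic (
  input  logic sel, a, b,
  output logic y
);
  always_comb begin
    if (sel) y = a;
    else
`ifndef SYNTHESIS        // simulation only: test explicitly for sel == 0
    if (!sel)
`endif
      y = b;
`ifndef SYNTHESIS        // simulation only: propagate X when sel is X or Z
    else y = 'x;
`endif
  end
endmodule
```

When SYNTHESIS is defined, the code reduces to a plain if...else; otherwise simulation sees the X-pessimistic three-branch form.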
• If the condition evaluates as true (any bit is a 1), the operator returns the value of
expression1.
• If the condition evaluates as false (all bits are 0), the operator returns the value of
expression2.
• If the condition evaluates to unknown, the operator does a bit-by-bit comparison of
the values of expression1 and expression2. For each bit position, if that bit is 0 in
both expressions, then a 0 is returned for that bit. If both bits are 1, a 1 is returned.
If the corresponding bits in each expression are different, or Z, or X, then an X is
returned for that bit.
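The bit-by-bit merge performed for an unknown condition can be illustrated with a short simulation-only sketch. The signal names and values are chosen for this illustration only.

```systemverilog
module tb_conditional_operator;
  logic       sel;
  logic [3:0] a, b, y;

  initial begin
    sel = 1'bx;                 // unknown condition
    a   = 4'b1100;
    b   = 4'b1010;
    y   = sel ? a : b;          // bits that agree are kept, bits that differ become X
    $display("y = %b", y);      // prints: y = 1xx0
  end
endmodule
```

Bits 3 and 0 are the same in both expressions and pass through as known values, while bits 2 and 1 differ and are returned as X.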
The following example and table compare the X-optimistic behavior of if...else,
with a pessimistic if...else, the mixed-optimism conditional operator, and actual silicon.
The table is based on all signals being 1-bit wide.
always_comb begin // X-optimistic if...else
  if (sel) y1 = a;
  else     y1 = b;
end
sel  a  b    optimistic    pessimistic    ? :    silicon
             if...else     if...else
 X   0  0    0             X              0      0
 X   0  1    1             X              X      0 or 1
 X   1  0    0             X              X      0 or 1
 X   1  1    1             X              1      1
Table C-6: Conditional operator X propagation compared to optimistic if...else and
pessimistic if...else
As can be seen in this table, the conditional operator represents a mix of X-optimism
and X-pessimism, and more accurately represents the ambiguities of actual silicon
behavior, given an uncertain selection condition. For this reason, Turpin [1]
recommends using the conditional operator instead of if...else in combinational logic.
This author does not agree with Turpin’s coding guideline for two reasons. First,
complex decode logic often involves multiple levels of decisions. Coding with
if...else and case statements can help make complex logic more readable, easier to
debug, and easier to reuse. Coding the same logic with nested levels of conditional
operators obfuscates code and adds a risk of coding errors. Furthermore, synthesis
compilers might not permit or properly translate nested conditional operators.
A second reason the conditional operator should not always be used in place of
if...else is when the condition is based on a signal or expression that is more than
one bit wide. The condition is evaluated as a true/false expression. In a multi-bit
value, if any bit is 1, the condition is considered to be true, even if some other bits are
X or Z. The conditional operator will optimistically return the value of expression1,
rather than propagate an X.
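The multi-bit case can be sketched as follows, using a hypothetical 2-bit condition named mode with one known 1 bit and one unknown bit.

```systemverilog
module tb_multibit_condition;
  logic [1:0] mode;
  logic [3:0] a, b, y;

  initial begin
    mode = 2'b1x;               // one bit is 1, the other is unknown
    a    = 4'b1010;
    b    = 4'b0101;
    y    = mode ? a : b;        // condition is true because one bit is 1
    $display("y = %b", y);      // prints: y = 1010 (the X in mode is not propagated)
  end
endmodule
```

The unknown bit in mode leaves no trace in y, so a problem with the condition would go unnoticed.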
Sections C.7 (page 430) and C.9 (page 432) show ways to keep the benefits of
if...else and case statements, and also have the benefit of the conditional operator’s
balance of X-optimism and X-pessimism.
end
Example 21: Logical operators with X-pessimism
The return from the expression (a > b) for the values shown in this example will
be 1'bx. In this simple code snippet, it is obvious that the value of a is greater than the
value of b, regardless of the actual value of the least-significant bit of b. Actual silicon
would not have an ambiguous result.
Arithmetic operations are also X-pessimistic, and will propagate an X if there is
any ambiguity of the input values.
logic [3:0] a = 4'b0000;
logic [3:0] b = 4'b001z;
logic [3:0] sum;
always_comb begin
sum = a + b;
end
Example 22: Arithmetic operator with X-pessimism
With arithmetic operators, all bits of the operation result are X, which can be overly
pessimistic. In this example, sum will have a value of 4'bxxxx. In silicon, only the
least-significant bit is affected by the ambiguous bit in b. The silicon result would be
either 4'b0010 or 4'b0011. A more accurate representation of the silicon ambiguity
would be: 4'b001x.
Arithmetic operations are X-pessimistic, even when the result in silicon would not
have any ambiguity at all.
logic [3:0] b = 4'b001x;
logic [4:0] product;
always_comb begin
product = b * 0; // multiply b with 0
end
Example 23: Overly pessimistic arithmetic operation
In this example, product will have an overly pessimistic value of 5'bxxxxx, but in
silicon (and in normal arithmetic) zero times anything, even an ambiguous value,
would result in 0.
Since each input can have 4 values and 12 transitions to and from those values,
these truth tables can be quite large. By default, UDPs are pessimistic — any undefined
input value combination that is not explicitly defined in the table will default to
a result of X. Library developers often take advantage of this default to reduce the
number of lines that need to be defined in the truth table. It is not uncommon for a
UDP to only define output values for all possible 2-state combinations and transitions.
Any X or Z values on an input, or transitions to and from X or Z, will default to
propagating an X on the UDP output.
An inadvertent omission from the UDP truth table will also propagate an X value.
This pessimism might be great for finding bugs in the library, but is often a source of
frustration for engineers using a library from a 3rd party vendor.
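As a minimal sketch of this default, the hypothetical UDP below (udp_mux is an invented name, not taken from any vendor library) only defines rows where the select input is 0 or 1. With an X on sel no row matches, so the output is X even though both data inputs are 1 and silicon would output 1.

```systemverilog
// Hypothetical 2:1 mux UDP: the table only covers sel = 0 and sel = 1.
primitive udp_mux (output y, input sel, a, b);
  table
  // sel a b : y
      0  0 ? : 0;
      0  1 ? : 1;
      1  ? 0 : 0;
      1  ? 1 : 1;
  endtable
endprimitive

module udp_demo;
  logic sel = 1'bx, a = 1'b1, b = 1'b1;
  wire  y;

  udp_mux u1 (y, sel, a, b);

  initial begin
    #1 $display("y = %b", y);
    // No table row matches sel = X, so the UDP defaults to X
    assert (y === 1'bx) else $error("expected X on the UDP output");
  end
endmodule
```

A less pessimistic library cell would add rows such as "x 1 1 : 1;" for the cases where both data inputs agree.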
always_comb begin
out = data[i]; // variable bit select of data
end
Example 24: Ambiguous bit select with X-pessimism
The ambiguity of the value of i means that out will be X. This pessimistic rule
means that problems with an index will propagate to the result of the operation. Since
the values of data and i could change during simulation, this pessimism will be sure
to propagate an X whenever an ambiguous value of i might occur.
This X-pessimistic rule does not accurately represent silicon behavior, however.
There are times when an ambiguity in the index can still result in a known value. With
the values shown in Example 24, the ambiguous value of i would either select bit 0 or
2. In either case, out would receive the deterministic value of 0 in actual silicon.
always_comb begin
out = data << i; // shift of data
end
Example 25: Ambiguous shift operation with X-pessimism
The result of this shift operation is 8'bxxxxxxxx. As with other pessimistic
operations, this will be sure to propagate an X result whenever the exact number of times to
shift is uncertain. Setting all bits of the result to X, however, can be overly
pessimistic, and not represent actual silicon behavior, where only some bits of the result might
be ambiguous, instead of all bits. Given the values in Example 25, data is either
shifted 0 times or 2 times. The two possible results are 8'b10001000 and
8'b00100000. If only the ambiguous bits of these two results were set to X, the
X-optimistic value of out would be 8'bx0x0x000 instead of an overly pessimistic
8'bxxxxxxxx.
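The less pessimistic value can be worked out by computing both possible shift results and marking only the bits that differ as X. The following is a sketch of that calculation, not something a standard simulator does; merge_x is a hypothetical helper name.

```systemverilog
// Sketch only: merges two candidate results, keeping bits that agree.
module shift_merge_demo;
  function automatic logic [7:0] merge_x(input logic [7:0] r0,
                                         input logic [7:0] r1);
    for (int k = 0; k < 8; k++)
      merge_x[k] = (r0[k] === r1[k]) ? r0[k] : 1'bx;
  endfunction

  logic [7:0] data = 8'b10001000;   // value implied by the two results in the text
  logic [7:0] merged;

  initial begin
    merged = merge_x(data << 0, data << 2);   // shifted 0 times or 2 times
    $display("merged = %b", merged);
    assert (merged === 8'bx0x0x000);
  end
endmodule
```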
There have been arguments made that it is better to just eliminate logic X rather
than to deal with the hazards and difficulties of X-optimism and X-pessimism (see
[1], [5], [6]). Some SystemVerilog simulators offer a 2-state simulation mode,
typically enabled using an invocation option such as -2state or +2state.
Using 2-state simulation can offer several advantages:
• Eliminates uninitialized register and X propagation problems (the clock divider X
lock-up problem shown in section C.4 (page 417) would not occur in a 2-state
simulation).
• Eliminates certain potential mismatches between RTL simulation and how
synthesis interprets that code, because synthesis only considers 2-state values in most
RTL modeling constructs.
• RTL and gate-level simulation behaves more like actual silicon, since silicon
always has a 0 or 1, and never an X.
• Reduces the simulation virtual memory footprint; encoding 4-state values for each
bit, along with strength values for net types, requires much more memory than just
storing simple 2-state values.
• Improves simulation run-time performance, since 4-state encoding, decoding, and
operations do not need to be performed.
On the other hand, there are several hazards to consider when only 2-state values
are simulated.
First, a functional bug in the RTL or gate-level code might go undetected. Logic X
is a simulator’s way of indicating that it cannot accurately predict what actual silicon
would do under certain conditions. When X values occur in simulation, it is an
indication that there might be a design problem. Without X values, verification and
detection of possible design ambiguities can be more difficult.
A second hazard of 2-state simulation values is that, since there is no X value,
simulators must choose either a 0 or a 1 when situations occur where the simulator cannot
accurately predict actual silicon behavior. The value that is chosen only represents
one of the conditions that might occur in silicon. This means the design is verified for
that one value, and leaves any other possible values untested. That is dangerous!
Some simulators handle this hazard by simulating both values in parallel and merging
the results of the parallel threads. This concept is discussed in more detail in section
C.7 (page 430).
A third hazard is that all design registers, clock dividers, and input ports begin
simulation with a value of 0 or 1 instead of X. Silicon would also power up with values of
0 or 1, but are they the same values that were simulated? Cummings and Bening [6]
suggest that the most effective 2-state verification is performed by running hundreds
of simulations with each register bit beginning with a random 2-state value.
Cummings and Bening [6] also note that, at the time the paper was written, a preferred way
for handling seeding and repeatability of randomized 2-state register initialization
was patented by Hewlett-Packard, and might not be available for public use.
A fourth hazard is that verification cannot check for design problems using a logic
X or Z. The following two verification snippets will not work with 2-state
simulations:
assert (ena == 0 && data === 'Z)
else $error("Data bus failed to tri-state");
case ( {sel1,sel2} )
  2'b01: result = a + b;
  2'b10: result = a - b;
  2'b11: result = a * b;
  default: result = 'X;
endcase
Example 27: Assigning 4-state values in 2-state simulation
Synthesis compilers treat assignment of a logic X value as a don’t care assignment,
meaning the design engineer does not care if silicon sees a logic 0 or a logic 1 for each
bit of the assignment. In a 2-state simulation, the simulator must convert each bit of
the X assignment value to either a 0 or a 1. The specific value would be determined
by the simulator, since 2-state simulation is a feature of the simulator and not the
language. There is a high probability that the values used in simulation and the values
that occur in actual silicon will not be the same. In theory, this should not matter,
since by assigning a logic X, the engineer has indicated that the actual value is a
“don’t care”. The hazard is that, without X propagation, this theory is left unproven in
2-state simulation.
The original Verilog language only provided 4-state data types. The only way to
achieve the benefits of 2-state simulation was to use proprietary options provided by
simulators, as discussed in the previous section. These proprietary 2-state algorithms
do not work the same way with each simulator. 2-state simulation modes also make it
difficult to mix 2-state simulation in one part of a design and 4-state simulation in
other parts of the design.
SystemVerilog improves on the original Verilog language by providing a standard
way to handle 2-state simulations. Several SystemVerilog variable types only store 2-
state values: bit, byte, shortint, int, and longint. SystemVerilog-2012 adds
the ability to have user-defined 2-state net types, as well.
Using these 2-state data types has two important advantages over simulator-specific
2-state simulation modes:
• All simulators follow the same semantic rules for what value to use in ambiguous
conditions (such as power-up).
• It is easy to mix 2-state and 4-state within a design, which allows engineers to
select the appropriate type for each design or verification block.
The uninitialized value of 2-state variables is 0. This can help prevent blocks of
design logic from getting stuck in a logic X state at the start of simulation, as
discussed in section C.2.2 (page 402) earlier in this paper. The clock-divider circuit that
was described at the beginning of section C.4 (page 417) will work fine if the flip-flop
storage is modeled as a 2-state type.
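The effect can be illustrated with a generic divide-by-two that has no reset. This is a sketch that assumes a circuit similar in spirit to the clock divider of section C.4, not a copy of it; the names q4 and q2 are illustrative.

```systemverilog
// Sketch only: same divider logic, stored once as 4-state and once as 2-state.
module clkdiv_demo;
  logic clk = 0;
  logic q4;   // 4-state storage: begins simulation as X
  bit   q2;   // 2-state storage: begins simulation as 0

  always #5 clk = ~clk;

  always_ff @(posedge clk) q4 <= ~q4;   // ~X is X, so q4 never leaves X
  always_ff @(posedge clk) q2 <= ~q2;   // toggles from the first clock edge

  initial begin
    repeat (4) @(posedge clk);
    #1 $display("q4=%b q2=%b", q4, q2);
    assert ($isunknown(q4));    // still locked up in X
    assert (!$isunknown(q2));   // a known 0 or 1
    $finish;
  end
endmodule
```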
Having all variables begin with a logic 0 does not accurately mimic silicon
behavior, however, where each bit of each register can power-up to either 0 or 1. When all
variables start with a value of 0, only one extreme and unlikely hardware condition is
verified. Bening [5] suggests that simulation should begin with random values for all
bits in all registers, and that hundreds of simulations with different seed values should
be run, in order to ensure that silicon will function correctly at power-up under many
different conditions.
The ability to declare nets and variables that use either 2-state or 4-state value sets
makes it possible to freely mix 2-state and 4-state within a simulation. Engineers can
choose the benefits of 2-state performance in appropriate places within a design or
testbench, and choose the benefits of 4-state simulation where greater accuracy is
required.
SystemVerilog defines a standard rule for mapping 4-state values to 2-state values.
The rule is simple. When a 4-state value is assigned to a 2-state net or variable, any
bits that are X or Z are converted to 0. This simplistic rule eliminates X values, but
does not accurately mimic silicon behavior where each ambiguous bit might be either
a 0 or a 1, rather than always 0.
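A minimal sketch of this conversion rule, using illustrative variable names:

```systemverilog
// Sketch only: assigning a 4-state value to a 2-state variable.
module fourstate_to_twostate_demo;
  logic [3:0] four = 4'b1xz0;
  bit   [3:0] two;

  initial begin
    two = four;   // the X bit and the Z bit are both converted to 0
    $display("four=%b two=%b", four, two);
    assert (two == 4'b1000);
  end
endmodule
```

The ambiguity that was visible in four is silently gone in two, with no warning from the simulator.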
endmodule: cpu
Example 28: Program counter with unused inputs, 2-state data types
The program counter in this example is loadable, using an active-low loadN
control. The CPU model has an instance of the program counter, but does not use the
loadable new_count input or loadN control. Since they are not used, these inputs are
left unconnected, which is probably an inadvertent design bug! With 2-state data
types, however, the unconnected inputs will have a constant value of 0, which means
the statement
if (!loadN) count <= new_count;
will always evaluate as true, and the program counter will be locked in the load
state, rather than incrementing on each clock edge.
In this small example, this bug would be easy to find. Imagine, though, a similar
bug in a huge ASIC or FPGA design. Simple mistakes that are hidden by not having a
logic X show up in simulation can become very difficult to find. Worse, the symptom
of having a logic 0, instead of a logic X, might make a design bug appear to be
working at the RTL level, and not show up until gate-level simulations are run. (And what
if your team doesn’t do gate-level simulations?)
After having a 2-state data type hide a design error or cause bizarre simulation
results in a large, complex design, you too might feel, as the author does, that “I’m
still in love with my X!”
The previous sections in this paper have shown that SystemVerilog can sometimes
be overly optimistic, and at other times overly pessimistic in how logic X is
propagated, and that 2-state simulations and data types can hide design problems by
completely eliminating Xs. Can a balance between these two extremes be found by
breaking the IEEE 1800 SystemVerilog standard X propagation rules and simulating
with a different algorithm?
Some simulators provide proprietary invocation options to begin simulation with
random variable values, instead of with X values. Using simulator-specific options
can accomplish Bening’s recommended approach of randomly initializing all
registers using a different seed [5]. Since these options are not part of the SystemVerilog
language, however, the capability is not available on every simulator and does not
work the same way on simulators that have the feature.
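Where no such option is available, a testbench can approximate the idea with standard language features. The sketch below is an assumption about how this might be done, not a substitute for a simulator's own mechanism: the register count and the "+seed=<n>" plusarg are hypothetical names, while $urandom and $value$plusargs are standard system functions.

```systemverilog
// Sketch only: seeded random start value for one register, set from the testbench.
module rand_init_demo;
  logic [7:0]  count;      // stands in for a design register (4-state)
  int unsigned seed = 1;   // default seed, overridden by a hypothetical +seed=<n>

  initial begin
    void'($value$plusargs("seed=%d", seed));
    count = $urandom(seed);   // seeded, so a failing run can be repeated
    $display("seed=%0d  count=%b", seed, count);
    assert (!$isunknown(count));
  end
endmodule
```

Running the same test many times with different seed values exercises different power-up states, which is the intent of Bening's recommendation.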
Some SystemVerilog simulators offer a way to reduce X-optimism in RTL
simulation by using a more pessimistic, non-standard algorithm. For example, the Synopsys
VCS “-xprop” [11] simulation option causes VCS to use simulator-specific X
propagation rules for if...else and case decision statements and posedge or negedge
edge sensitivity. This non-standard approach tries to find a balance between
X-optimism and X-pessimism. See Evans, Yam and Forward [12] and Greene, Salz and
Booth [13] for more information on, and experience with, using proprietary
X-propagation rules to change SystemVerilog’s X-optimism and X-pessimism behavior.
One concern with proprietary X propagation rules is that their purpose is to ensure
that design bugs will propagate downstream from the cause of the problem, so that the
bug will be detected instead of hidden. This then requires tracing the cause of an X
back through many lines of code, branching statements, and clock cycles to find the
original cause of the problem. Though most simulators provide powerful debug tools
for tracing back X values, the process can still be tedious and time consuming.
Another concern is the risk of false failures, by making simulation more
X-pessimistic. Finding a balance between X-optimism and being overly pessimistic can be
good, but, like the ? : conditional operator, will not always perfectly match silicon
behavior (see section C.4.2, page 419). There might still be times when this balance
of X-optimism and X-pessimism can result in false failures. At best, these false
failures can consume significant project man-hours to determine that there is no actual
design problem. Worse, and very possible, these false failures could potentially
cause problems with simulation locking up in X states, as described in section C.2.2
(page 402).
There have been proposals to modify SystemVerilog’s X-optimism and
X-pessimism rules in some future version of SystemVerilog. If readers of this paper feel these
enhancements would be important for their projects, they should pressure their EDA
vendors to push for these enhancements in the next SystemVerilog standard.
One of the X-optimism issues presented in this paper is that wildcard “don’t care”
bits in casex, casez and case...inside statements mask out all 4 possible 4-state
values, causing unknown bits in case expressions to be treated as don’t care values.
Turpin [1] proposed adding the ability to specify 2-state wildcard “don’t care”
values using an asterisk (instead of X, Z or ?), as follows:
always_comb begin
  case (sel) inside
    3'b1** : y = a; // matches 100, 101, 110, 111
    3'b00* : y = b; // matches 000, 001
    3'b01* : y = c; // matches 010, 011
    default: y = 'x;
  endcase
end
Example 29: Proposed case...inside with 2-state don’t cares
Let’s face it, when an X shows up, trouble is sure to follow! Rather than having X
problems propagate through countless lines of code, decision branches, and clock
cycles, it would be much better to detect an X the moment it occurs. Detecting when
an X first appears solves the problems of both X-optimism and X-pessimism!
if (sel) y = a;
else y = b;
end
Example 31: if...else with X-trap assertion
This is the same if...else example that has been presented in previous sections, but
with an added assertion to validate the value of sel each time it is evaluated.
Without the assertion, this simple if...else decision has several potential X
hazards, as was discussed in sections C.3.1 (page 406) and C.4.1 (page 418). Adding an
immediate assertion to verify if conditions is simple to do, and avoids all of these
hazards. A problem with the if condition is detected when and where the problem
occurs, rather than hoping that propagating an X will make it visible sometime,
somewhere. Assert statements are ignored by synthesis, so no code has to be hidden from
synthesis compilers.
The author recommends that if statements that are conditioned on a module input
port have an immediate assertion to validate the if condition. A text-substitution
macro could be defined to simplify using this assertion in many places.
// Passes when no bit of cond is X or Z (the reduction XOR is X otherwise)
`define assert_condition(cond) \
  assert (^(cond) !== 1'bx) \
  else $error("%m, ifcond = X")

always_comb begin
  `assert_condition(sel);
  if (sel) y = a;
  else y = b;
end

always_comb begin
  `assert_condition({a,b,c,d});
  `assert_condition(sel);
  case (sel)
    2'b00 : out = a;
    2'b01 : out = b;
    2'b10 : out = c;
    2'b11 : out = d;
  endcase
end
Example 32: Using an X-trap assertion macro
SystemVerilog assertions are ignored by synthesis compilers, and therefore can be
placed directly in RTL code without having to hide them from synthesis using
conditional compilation or pragmas. It is also possible to place the assertions in a separate
file and bind them to the design module using SystemVerilog’s binding mechanism.
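A minimal sketch of the bind approach follows. The module names mux2 and mux2_xtrap and the instance name xtrap_i are illustrative assumptions; bind and $isunknown are standard SystemVerilog.

```systemverilog
// Design module: contains no assertion code.
module mux2 (input logic sel, a, b, output logic y);
  always_comb begin
    if (sel) y = a;
    else     y = b;
  end
endmodule

// Hypothetical checker module, which could be kept in a separate file.
module mux2_xtrap (input logic sel);
  always_comb begin
    assert (!$isunknown(sel))
    else $error("%m: sel is X or Z");
  end
endmodule

// Places an instance of the checker inside every instance of mux2.
bind mux2 mux2_xtrap xtrap_i (.sel(sel));
```

The port connection in the bind statement is resolved in the scope of mux2, so the checker sees the same sel that the if...else decision evaluates.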
This section presents a few coding guidelines that help to appropriately use and
benefit from SystemVerilog’s X-optimism and X-pessimism, and minimize the
potential hazards associated with hiding or propagating an X.
Your X can be your best friend. X values indicate that there is some sort of
ambiguity in the design. Eliminating X values using 2-state data types does not eliminate the
design ambiguity. Sutherland HDL recommends using 4-state data types in all places,
with two exceptions:
• The iterator variable in for-loops is declared as an int 2-state variable.
• Verification stimulus variables that will (or might) have randomly generated values
are declared as 2-state types.
This coding guideline uses 2-state types only for variables that will never be built in
silicon, and therefore do not need to reflect an ambiguous condition that might exist
in silicon.
There is one other place where 2-state types might be appropriate, which is the
storage of large memory arrays. Using 2-state types for large RAM models can
substantially reduce the virtual memory needed to simulate the memory. This savings comes
at a risk, however. Should the design fail to correctly write or read from a memory
location, there will be no X values to indicate there was a problem. To help minimize
that risk, it is simple to model RAM storage, so that it can be configured to simulate
as either 2-state storage (using the bit type) or 4-state storage (using the logic
type).
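One way such a configurable RAM might be written is with a type parameter, as in the sketch below. The module name, parameter names, and port list are assumptions made for illustration; a compiler directive and typedef would be an equally valid choice.

```systemverilog
// Hypothetical RAM model: STORE_T selects 2-state or 4-state storage.
module ram_model #(
  parameter type STORE_T = logic,   // override with bit for 2-state storage
  parameter int  AW      = 10,      // address width
  parameter int  DW      = 8        // data width
) (
  input  logic          clk, we,
  input  logic [AW-1:0] addr,
  input  logic [DW-1:0] wdata,
  output logic [DW-1:0] rdata
);
  STORE_T [DW-1:0] mem [0:2**AW-1];

  always_ff @(posedge clk)
    if (we) mem[addr] <= wdata;

  assign rdata = mem[addr];   // unwritten locations read X only with logic storage
endmodule
```

An instance such as ram_model #(.STORE_T(bit)) trades X visibility for a smaller memory footprint, while the default keeps 4-state storage for debugging.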
Section C.2.2 (page 402) discussed the problems associated with design variables,
especially those used to model hardware registers, beginning simulation with X
values. Section C.5 (page 426) discussed using proprietary simulation options to
initialize register variables with random values. If that feature is available, it should be
used!
Another way to randomly initialize registers is using the UVM Register Abstraction
Layer (RAL). UVM is a standard, and is well supported in major SystemVerilog
simulators. A UVM testbench and RAL are not trivial to set up, but can provide a
consistent way to randomly initialize registers. The advantage of using UVM to initialize
registers is that it will work with all major simulators.
All RTL models intended for synthesis should have SystemVerilog assertions that detect
X values on if...else and case select conditions. Other critical signals can also have
X-detect assertions on them. Design engineers should be responsible for adding these
assertions. Section C.9 (page 432) showed how easy it is to add X-detecting
assertions.
C.11 Conclusions
This paper has discussed the benefits and hazards of X values in simulation.
Sometimes SystemVerilog is optimistic about how X values affect design functionality, and
sometimes SystemVerilog is pessimistic.
X-optimism has been defined in this paper as any time simulation converts an X
value on the input to an operation or logic gate into a 0 or 1 on the output. Some key
points that have been discussed include:
• X-optimism can accurately represent real silicon behavior when an ambiguous
condition occurs. For example, if one input to an AND gate is uncertain, but the other
input is 0, the output of the gate will be 0. SystemVerilog’s X-optimistic AND
operator and AND primitive behave the same way.
• X-optimism is essential for some simulation conditions, such as the synchronous
reset circuit shown in Section C.3 (page 405).
• SystemVerilog can be overly optimistic, meaning an X propagates as a 0 or 1 in
simulation when actual silicon is still ambiguous. Over optimism can lead to only
one of the possible silicon values being verified.
• In all circumstances, X-optimism has the risk of hiding design bugs. An ambiguous
condition that causes an X deep in the design might not propagate as an X to a point
in the design that is being observed by verification. The value that does propagate
might appear to be a good value.
X-pessimism is defined in this paper as any time simulation passes an X on an input
through to the output. X-pessimism can be desirable or undesirable.
• X-pessimism will not hide design bugs the way X-optimism might. An ambiguous
condition deep within a design will propagate as an X value to points that
verification is observing.
• X-pessimism can lead to false failures, where actual silicon will function correctly,
such as if one input to an AND gate is an X, but the other input is 0. A false X
might need to be traced back through many levels of logic and clock cycles before
determining that there is not an actual problem.
• X-pessimism can lead to simulation locking up with X values, where actual
silicon will function correctly, even if the logic values in silicon are ambiguous. The
clock divider shown in section C.4 (page 417) is an example of this.
It might be tempting to use 2-state data types or 2-state simulation modes to
eliminate the hazards of an X. Although there are some advantages to 2-state simulation,
those advantages do not outweigh the benefits of 4-state simulation. 2-state
simulation will hide all design ambiguities, and often not simulate with the same values that
actual silicon would have. 2-state data types should only be used for generating
random stimulus values. Design code should use 4-state types.
The best way to handle X problems is to detect the X as close to its original source
as possible. This paper has shown how SystemVerilog assertions can be used to easily
detect and isolate design bugs that result in an X. With early detection, it is not
necessary to rely on X propagation to detect design problems.
All engineers should be in love with their X! X values indicate that there might be
some ambiguity in an actual silicon implementation of intended functionality.
The author appreciates the contributions Don Mills and Shalom Bresticker have
made to this paper. Don provided several of the examples and coding
recommendations in this paper, and provided valuable suggestions regarding the paper content.
Shalom made an in-depth technical review of the paper draft and provided detailed
comments on how to improve the content of the paper.
The author also expresses gratitude to his one-and-only wife of more than 30 years
(she will never be an “X”), who, despite the title of the paper, painstakingly reviewed
the paper for grammar, punctuation, and sentence structure.
C.13 References
[1] Turpin, “The dangers of living with an X,” Synopsys Users Group Conference (SNUG)
Boston, 2003.
[2] Mills, “Being assertive with your X (SystemVerilog assertions for dummies),” Synopsys
Users Group Conference (SNUG) San Jose, 2004.
[3] “P1800-2012/D6 Draft Standard for SystemVerilog--Unified Hardware Design,
Specification, and Verification Language (re-ballot draft),” IEEE, Piscataway, New Jersey.
Copyright 2012. ISBN: (not yet assigned).
[4] Merriam-Webster online dictionary, http://www.merriam-webster.com/, accessed
11/20/2012.
[5] Bening, “A two-state methodology for RTL logic simulation,” Design Automation
Conference (DAC) 1999.
[6] Cummings and Bening, “SystemVerilog 2-state simulation performance and verification
advantages,” Synopsys Users Group Conference (SNUG) Boston, 2004.
[7] Piper and Vimjam, “X-propagation woes: masking bugs at RTL and unnecessary debug at
the netlist,” Design and Verification Conference (DVCon) 2012.
[8] Weber and Pecor, “All My X values Come From Texas...Not!,” Synopsys Users Group
Conference (SNUG) Boston, 2004.
[9] Turpin, “Solving Verilog X-issues by sequentially comparing a design with itself,”
Synopsys Users Group Conference (SNUG) Boston, 2005.
[10] Chou, Chang and Kuo, “Handling don’t-care conditions in high-level synthesis and
application for reducing initialized registers,” Design Automation Conference (DAC) 2009.
[11] Greene, “Getting X Propagation Under Control,” a tutorial presented by Synopsys,
Synopsys Users Group Conference (SNUG) San Jose, 2012.
[12] Evans, Yam and Forward, “X-Propagation: An Alternative to Gate Level Simulation,”
Synopsys Users Group Conference (SNUG) San Jose, 2012.
[13] Greene, Salz and Booth, “X-Optimism Elimination during RTL Verification,” Synopsys
Users Group Conference (SNUG) San Jose, 2012.
[14] Browy and K. Chang, “SimXACT delivers precise gate-level simulation accuracy when
unknowns exist,” White paper, http://www.avery-design.com/files/docs/SimXACT_WP.pdf,
accessed 11/12/2012.
[15] Cummings, “SystemVerilog 2012 new proposals for design engineers,” presentation at
SystemVerilog Standard Working Group meeting, 2010,
http://www.eda.org/sv-ieee1800/Meetings/2010/February/Presentations/Cliff%20Cummings%20Presentation.pdf,
accessed 11/12/2012.
[16] Mills, “Yet another latch and gotchas paper,” Synopsys Users Group Conference (SNUG)
San Jose, 2012.
Appendix D
Additional Resources
This appendix lists some additional resources that are closely related to the topics
presented in this book, and might be of particular interest to RTL design engineers.
Books. Two books that are important companions to this book are:
• IEEE Std 1800-2012, SystemVerilog Language Reference Manual (LRM).
Officially titled, “IEEE Standard for SystemVerilog--Unified Hardware Design,
Specification, and Verification Language”. Copyright 2013 by The Institute of Electrical
and Electronics Engineers, Inc. ISBN 978-0-7381-8110-3 (PDF) and
978-0-7381-8111-0 (print).
This is the official standard for the syntax and semantics of the SystemVerilog
language. The book is written primarily for companies that implement Electronic
Design Automation (EDA) tools, such as simulators and synthesis compilers.
Note that the standard does not distinguish between what aspects of the language are
for design and what aspects are for verification. Available to download at:
https://standards.ieee.org/getieee/1800/download/1800-2012.pdf.
• “SystemVerilog for Verification, Third Edition”, by Chris Spear and Greg
Tumbush. Copyright 2012, Springer, New York, New York. ISBN 978-1-4614-0715-7.
Presents the numerous verification constructs in SystemVerilog, which are not
covered in this book. For more information, refer to the publisher’s web site:
http://www.springer.com/engineering/circuits+%26+systems/book/978-1-4614-0714-0.
Some additional books that might be of interest include:
• “Constraining Designs for Synthesis and Timing Analysis”, by Sridhar
Gangadharan and Sanjay Churiwala. Copyright 2013, Springer, New York, New York.
ISBN 978-1-4614-3268-5.
A guide to specifying timing constraints for synthesis, static timing analysis and
placement and routing using the industry format of Synopsys Design Constraints
(SDC). Chapters on synthesis constraints, multi-clock boundaries and
clock-domain crossing are particularly germane to the RTL synthesis coding styles
presented in this book. For more information, refer to the publisher’s web site:
http://www.springer.com/engineering/circuits+%26+systems/book/978-1-4614-3268-5.
Conference papers. Following is a list of a few conference papers that explore some
of the topics discussed in this book, and which might be of particular interest to RTL
design engineers.
NOTE
Examples shown in conference papers might not use up-to-date best practice
coding styles. Some of the papers cited in this appendix were written before
SystemVerilog extended the original Verilog language. The capabilities of
both the language and synthesis compilers have evolved significantly since
some of these older papers were written. The engineering principles
discussed in these papers are still relevant and useful, but some examples and
recommended coding styles might be obsolete. It is left to the reader to
rewrite these examples using the best-practice SystemVerilog coding styles
presented in this book.
1. “Who Put Assertions In My RTL Code? And Why? How RTL Design Engineers
Can Benefit from the Use of SystemVerilog Assertions”, by Stuart Sutherland.
Presented at the 2013 Silicon Valley Synopsys Users Group Conference (SNUG).
Available for download at sutherland-hdl.com.
2. “Synchronization and Metastability”, by Steve Golson. Presented at the 2014
Silicon Valley Synopsys Users Group Conference (SNUG). Available at
trilobyte.com.
3. “Yet Another Latch and Gotchas Paper”, by Don Mills. Presented at the 2012
Silicon Valley Synopsys Users Group Conference (SNUG). Available at
lcdm-eng.com.
4. “RTL Coding Styles That Yield Simulation and Synthesis Mismatches”, by Don
Mills and Clifford Cummings. Presented at the 2001 Europe Synopsys Users
Group (SNUG). Available at lcdm-eng.com or sunburst-design.com.
5. “Asynchronous & Synchronous Reset Design Techniques - Part Deux”, by
Clifford Cummings, Don Mills and Steve Golson. Presented at the 2003 Boston
Synopsys Users Group Conference (SNUG). Available at lcdm-eng.com,
sunburst-design.com, or trilobyte.com.
6. “‘full_case parallel_case’, the Evil Twins of Verilog Synthesis”, by Clifford
Cummings. Presented at the 1999 Boston Synopsys Users Group Conference
(SNUG). Available at sunburst-design.com.
7. “Language Wars in the 21st Century: Verilog versus VHDL - Revisited”, by
Steve Golson and Leah Clark. Presented at the 2016 Silicon Valley Synopsys
Users Group Conference (SNUG). Available at trilobyte.com.
In addition to these papers, Stuart Sutherland, the author of this book, has authored
or co-authored many papers on SystemVerilog topics (and on the original Verilog
language). These papers and presentation slides are available at sutherland-hdl.com.
Index