STA Chapter-10 by Jay Bhaskar

Uploaded by

mbalaji00000
Chapter 10: Robust Verification

• On-Chip Variations
• Time Borrowing
• Data to Data Checks
• Non-Sequential Checks
• Clock Gating Checks
• Power Management
• Sign-off Methodology
• Statistical STA
• Paths Failing Timing
• Validating Timing Constraints
On-Chip Variations
• The process and environmental parameters may not be uniform across different portions of
the die. Due to process variations, identical MOS transistors in different portions of the die
may not have similar characteristics. Process variations are classified as:
1. Local process variations
2. Global process variations
• Besides the variations in the process parameters, different portions of the design may also
see different power supply voltage and temperature, so two regions of the same chip may not
be at identical PVT conditions. Contributing factors include:
i. IR drop variation along the die area affecting the local power supply.
ii. Voltage threshold variation of the PMOS or the NMOS device.
iii. Channel length variation of the PMOS or the NMOS device.
iv. Temperature variations due to local hot spots.
• Modeling of OCV is not intended to model the entire span of the PVT variations
possible from wafer to wafer but to model the PVT variations that are possible
locally within a single die.
• The cell delays, the wire delays, or both can be derated to model the effect of OCV.
• The worst condition for a setup check occurs when the launch clock path and the
data path have the OCV conditions which result in the largest delays, while the
capture clock path has the OCV conditions which result in the smallest delays.
LaunchClockPath + MaxDataPath <= ClockPeriod + CaptureClockPath - Tsetup_UFF1
This implies that the minimum clock period is:
LaunchClockPath + MaxDataPath - CaptureClockPath + Tsetup_UFF1
From the figure,
LaunchClockPath = 1.2 + 0.8 = 2.0
MaxDataPath = 5.2
CaptureClockPath = 1.2 + 0.86 = 2.06
Tsetup_UFF1 = 0.35
This results in a minimum clock period of:
2.0 + 5.2 - 2.06 + 0.35 = 5.49ns
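The arithmetic above can be checked with a few lines of Python. This is only a sketch of the setup inequality as stated on the slide; the function and variable names are made up for illustration and do not come from any STA tool.

```python
def min_clock_period(launch_clock, max_data, capture_clock, t_setup):
    """Smallest period that satisfies
    launch_clock + max_data <= period + capture_clock - t_setup."""
    return launch_clock + max_data - capture_clock + t_setup

# Delay values taken from the example (ns), no derating applied.
launch = 1.2 + 0.8      # common clock segment + launch branch
capture = 1.2 + 0.86    # common clock segment + capture branch
period = min_clock_period(launch, 5.2, capture, 0.35)

print(f"minimum clock period = {period:.2f} ns")
```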
• Cell and net delays can be derated using the set_timing_derate
specification.
set_timing_derate -early 0.8
set_timing_derate -late 1.1
• The -cell_delay and -net_delay options restrict the derating to cell delays or
net delays:
set_timing_derate -cell_delay -early 0.9
set_timing_derate -cell_delay -late 1.0
set_timing_derate -net_delay -early 1.0
set_timing_derate -net_delay -late 1.2
• The -clock and -data options restrict the derating to clock paths or data paths:
set_timing_derate -early 0.95 -clock
set_timing_derate -late 1.05 -data
• We now apply the following derating to the example:
set_timing_derate -early 0.9
set_timing_derate -late 1.2
set_timing_derate -late 1.1 -cell_check

• With these derating values, we get the following for the setup check:
LaunchClockPath = 2.0 * 1.2 = 2.4
MaxDataPath = 5.2 * 1.2 = 6.24
CaptureClockPath = 2.06 * 0.9 = 1.854
Tsetup_UFF1 = 0.35 * 1.1 = 0.385

• This results in a minimum clock period of:
2.4 + 6.24 - 1.854 + 0.385 = 7.171ns
• The pessimism caused by different derating factors applied on the common part
of the clock tree is called Common Path Pessimism (CPP), which should be
removed during the analysis. CPPR, which stands for Common Path Pessimism
Removal, is often listed as a separate item in a path report. It is also labeled as
Clock Reconvergence Pessimism Removal (CRPR).
CPP = LatestArrivalTime@CommonPoint - EarliestArrivalTime@CommonPoint
For the example above:
LatestArrivalTime@CommonPoint = 1.2 * 1.2 = 1.44
EarliestArrivalTime@CommonPoint = 1.2 * 0.9 = 1.08
This implies a CPP of: 1.44 - 1.08 = 0.36ns
With the CPP correction, this results in a minimum clock period of:
7.171 - 0.36 = 6.811ns
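As a cross-check of the derated setup numbers and the CPP correction, here is a minimal Python sketch. It assumes the 1.2ns segment is the common clock path, as the CPP calculation implies; all names are illustrative.

```python
# Derating factors from the example: early paths, late paths, setup check.
EARLY, LATE, LATE_CHECK = 0.9, 1.2, 1.1

common = 1.2                              # clock segment shared by launch and capture (ns)
launch = (common + 0.8) * LATE            # launch clock path, derated late
data = 5.2 * LATE                         # max data path, derated late
capture = (common + 0.86) * EARLY         # capture clock path, derated early
t_setup = 0.35 * LATE_CHECK               # setup time, derated by -cell_check

period_ocv = launch + data - capture + t_setup

# The common segment cannot be both late and early at the same time,
# so the difference is pessimism that can be credited back.
cpp = common * LATE - common * EARLY
period_cppr = period_ocv - cpp

print(f"with OCV derating: {period_ocv:.3f} ns")
print(f"CPP:               {cpp:.2f} ns")
print(f"after CPPR:        {period_cppr:.3f} ns")
```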
• Analysis with OCV at Worst PVT Condition:
• If the setup timing check is being performed at the worst-case PVT condition, no
derating is necessary on the late paths as they are already the worst possible.
set_timing_derate -early 0.9
set_timing_derate -late 1.0
• OCV for Hold Checks:
• We now examine how the derating is done for a hold timing check.
• If the PVT conditions are different along the chip, the worst condition for a hold
check occurs when the launch clock path and the data path have OCV conditions
which result in the smallest delays (that is, the earliest launch clock), while the
capture clock path has OCV conditions which result in the largest delays.
• The hold timing check is specified in the following expression for this example.
LaunchClockPath + MinDataPath - CaptureClockPath - Thold_UFF1 >= 0
• Applying the delay values in Figure 10-2 to the expression, we get
(without applying any derating):
LaunchClockPath = 0.25 + 0.6 = 0.85
MinDataPath = 1.7
CaptureClockPath = 0.25 + 0.75 = 1.00
Thold_UFF1 = 1.25
This implies that the condition is:
0.85 + 1.7 - 1.00 - 1.25 = 0.3ns >= 0
which is true, and thus no hold violation exists.
• Applying the following derate specification:
set_timing_derate -early 0.9
set_timing_derate -late 1.2
set_timing_derate -early 0.95 -cell_check
we get:
LaunchClockPath = 0.85 * 0.9 = 0.765
MinDataPath = 1.7 * 0.9 = 1.53
CaptureClockPath = 1.00 * 1.2 = 1.2
Thold_UFF1 = 1.25 * 0.95 = 1.1875
Common clock path pessimism:
0.25 * (1.2 - 0.9) = 0.075
The hold check condition then becomes:
0.765 + 1.53 - 1.2 - 1.1875 + 0.075 = -0.0175ns
which is negative, so there is a hold violation.
• Here is a hold timing check path report for an example design that uses
this derating.
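The hold calculation can be reproduced with a small Python sketch. It assumes the 0.25ns segment is the common clock path, as the pessimism term above implies; the function name and parameters are illustrative only.

```python
def hold_slack(early=1.0, late=1.0, early_check=1.0):
    """Hold slack (ns) for the example, including the common path
    pessimism credit. Factors of 1.0 mean no derating."""
    common = 0.25                            # shared clock segment
    launch = (common + 0.6) * early          # earliest launch clock
    data = 1.7 * early                       # min data path
    capture = (common + 0.75) * late         # latest capture clock
    t_hold = 1.25 * early_check              # hold time, derated by -cell_check
    cpp = common * (late - early)            # pessimism on the shared segment
    return launch + data - capture - t_hold + cpp

nominal = hold_slack()
derated = hold_slack(early=0.9, late=1.2, early_check=0.95)

print(f"no derating:   {nominal:+.4f} ns")
print(f"with derating: {derated:+.4f} ns")
```

A negative result corresponds to a hold violation, so the derated case fails while the underated case passes.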
Time Borrowing
• The time borrowing technique, which is also called cycle stealing, occurs
at a latch.
• In a latch, one edge of the clock is called the opening edge, where it
opens the latch so that the output of the latch is the same as the data input.
The other edge of the clock is called the closing edge, after which any change
on the data input is no longer available at the output of the latch.
• Since a latch is transparent when the clock is active, the data can arrive
later than the active clock edge, that is, it can borrow time from the
next cycle.
• If such time is borrowed, the time available for the following stage is
reduced.
• The first rule in timing to a latch is that if the data arrives before the
opening edge of the latch, the behavior is modeled exactly like a
flip-flop. The opening edge captures the data and the same clock edge
launches the data as the start point for the next path.
• The second rule applies when the data signal arrives while the latch is
transparent (between the opening and the closing edge). The output
of the latch, rather than the clock pin, is used as the launch point for
the next stage. The amount of time borrowed by the path ending at
the latch determines the launch time for the next stage.
• A data signal that arrives after the closing edge at the latch is a timing
violation.
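The three rules above can be sketched as a small Python function. This is a simplified model under stated assumptions: it ignores the latch setup time, the D-to-Q delay and clock uncertainty, and it treats an arrival exactly at the closing edge as still borrowing. The 15ns closing edge in the example is an assumed value.

```python
def latch_arrival(arrival, open_edge, close_edge):
    """Classify a data arrival at a latch.

    Returns (status, borrowed_time, next_launch_time); the last two are
    None when the arrival is a violation.
    """
    if arrival <= open_edge:
        # Rule 1: behaves like a flip-flop, launch from the opening edge.
        return ("no_borrow", 0.0, open_edge)
    if arrival <= close_edge:
        # Rule 2: latch is transparent, next stage starts when data arrives.
        return ("borrow", arrival - open_edge, arrival)
    # Rule 3: data arrived after the latch closed.
    return ("violation", None, None)

# Latch opens at 10ns (as in the example) and closes at 15ns (assumed).
for t in (8.0, 12.0, 16.0):
    print(t, latch_arrival(t, open_edge=10.0, close_edge=15.0))
```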
• Example of time borrowing using the
active rising edge.
• If data DIN is ready at time A prior to
the latch opening on the rising edge of
CLK at 10ns, the data flows to the
output of the latch as it opens. If data
arrives at time B as shown for DIN
(delayed), it borrows time Tb.
• Figure 10-4 shows the timing regions for
data arrival for positive slack, zero slack,
and negative slack (that is, when a
violation occurs).
• Example of no time borrowing
• Example with time borrowed
• We analyze the timing from ULAT1 to DFF1 when time borrowing occurs.
• Example with timing violation
Data to Data Checks
• Setup and hold checks can also be applied between any two
arbitrary data pins, neither of which is a clock.
• One pin is the constrained pin, which acts like a data pin of a
flip-flop, and the second pin is the related pin, which acts like a clock pin
of a flip-flop.
• One important distinction with respect to the setup check of a
flip-flop is that the data to data setup check is performed on the same
edge as the launch edge.
• Thus, the data to data setup checks are also referred to as zero-cycle
checks or same-cycle checks.
• A data to data check is specified using the set_data_check constraint. Here are example
SDC specifications.
set_data_check -from SDA -to SCTRL -setup 2.1
set_data_check -from SDA -to SCTRL -hold 1.5

• The setup data check implies that SCTRL should arrive at least 2.1ns prior to the edge
of the related pin SDA. Otherwise it is a data to data setup check violation.
• The hold data check specifies that SCTRL should arrive at least 1.5ns after SDA. If the
constrained signal arrives earlier than this specification, then it is a data to data hold
check violation.
• Consider the and cell shown in the figure. We
assume the requirement is to ensure that
PNA arrives 1.8ns before the rising edge of
PREAD and that it should not change for
1.0ns after the rising edge of PREAD.
• In this example, PNA is the constrained pin
and PREAD is the related pin. The required
waveforms are shown in the figure.
• Such a requirement can be specified using
a data to data setup and hold check.
set_data_check -from UAND0/A1 -to UAND0/A2 -setup 1.8
set_data_check -from UAND0/A1 -to UAND0/A2 -hold 1.0
• The zero-cycle setup check causes the hold timing check to be different
from other hold check reports: the hold check is no longer on the same
clock edge.
• In the setup check report, the setup time is listed as the data check setup time.
• In some scenarios, a designer may require the data to data hold check to be
performed on the same clock cycle:
set_multicycle_path -1 -hold -to UAND0/A2
• With this specification, the hold check report uses the same clock edge for both pins.
• An alternate way of having the data to data hold check performed in the same cycle is to specify this
as a data to data setup check between the pins in the reverse direction:
set_data_check -from UAND0/A2 -to UAND0/A1 -setup 1.0
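One way to picture the same-cycle setup and hold requirement on PNA and PREAD is as a window around the related edge in which the constrained pin must not change. The Python sketch below illustrates that reading only; it is not how an STA tool evaluates set_data_check, and the edge time and transition times are assumed values.

```python
def data_check_violations(constrained_changes, related_edge, setup, hold):
    """Return the constrained-pin transition times that fall inside the
    open window (related_edge - setup, related_edge + hold)."""
    lo = related_edge - setup
    hi = related_edge + hold
    return [t for t in constrained_changes if lo < t < hi]

# PREAD rises at 10ns (assumed); PNA must be stable from 8.2ns to 11.0ns.
pna_changes = [7.5, 9.0, 11.5]       # hypothetical PNA transition times (ns)
bad = data_check_violations(pna_changes, related_edge=10.0, setup=1.8, hold=1.0)

print(bad)
```

Only the transition at 9.0ns lands inside the window, so it is the one that would be flagged.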

Non-Sequential Checks
• These checks are applied only to pins within a single cell or macro.
• A non-sequential check is a check between two pins, neither of which is a clock. One pin is the
constrained pin that acts like data, while the second pin is the related pin, which acts like a clock.
• The check specifies how long the data on the constrained pin must be stable before and after the
change on the related pin. This check is specified as part of the cell library specification and no explicit
data to data check constraint is required.
• Difference between the non-sequential checks and data to data checks:

Data to data checks:
• Timing checks between two data signals in a design.
• Defined by the designer based on the design needs and functionality.

Non-sequential checks:
• Timing checks between two pins of a cell, defined in the cell library.
• The setup and hold values are obtained from the standard cell library.
Clock Gating Checks
• A clock gating check occurs when a gating
signal can control the path of a clock signal
at a logic cell.
• The pin of the logic cell connected to the
clock is called the clock pin and the pin
where the gating signal is connected is
the gating pin.
• One condition for a clock gating check is
that the clock that goes through the cell
must be used as a clock downstream.
• Another condition for the clock gating
check applies to the gating signal. The
signal at the gating pin should
not be a clock or, if it is a clock, it should not
be used as a clock downstream.
Types of clock gating checks
• Active-high clock gating checks: Occur when the gating cell has an and or nand function.
• Active-low clock gating checks: Occur when the gating cell has an or or a nor function.
Active-High Clock Gating:
• As it is an and cell, a high on gating signal UAND0/A opens up the gating cell and allows the clock to
propagate through.
• The clock gating check is intended to validate that the gating pin transition does not create an active
edge for the fanout clock.
• For positive edge-triggered logic, this implies that the rising edge of the gating signal occurs during
the inactive period of the clock (when it is low).
• Similarly, for negative edge-triggered logic, the falling edge of the gating signal should occur only
when the clock is low.
Active-Low Clock Gating:
• As it is an or cell, a low on gating
signal UOR1/A1 opens up the
gating cell and allows the clock to
propagate through.
• The transitions of the gating signal
(rising or falling edges) must occur
only when the clock is high. This is
because the gating signal should
not cause an active edge for the
output gated clock.
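The phase rule for the two gating types can be sketched as follows. This is a simplified illustration that only tests which clock phase a gating transition falls in; it ignores the setup and hold margins around the clock edges that a real clock gating check includes. The 10ns clock that is high from 0 to 5ns is an assumed waveform.

```python
def gating_transition_ok(t, period, high_start, high_end, active_high):
    """True if a gating-signal transition at time t falls in the clock
    phase where it cannot create an edge on the gated clock.

    active_high (and/nand cell): transition allowed only while clock is low.
    active_low  (or/nor cell):   transition allowed only while clock is high.
    """
    phase = t % period
    clock_is_high = high_start <= phase < high_end
    return (not clock_is_high) if active_high else clock_is_high

# Assumed clock: period 10ns, high from 0 to 5ns.
for t in (2.0, 7.0):
    print(t,
          "and-gate ok:", gating_transition_ok(t, 10.0, 0.0, 5.0, active_high=True),
          "or-gate ok:", gating_transition_ok(t, 10.0, 0.0, 5.0, active_high=False))
```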
Clock Gating with a Multiplexer
• A clock gating check at the
multiplexer inputs ensures that
the multiplexer select signal
arrives at the right time to cleanly
switch between MCLK and TCLK.
• For this example, we are
interested in switching to and
from MCLK and assume that TCLK
is low when the select signal
switches.
Clock Gating with Clock Inversion
• Another clock gating example is one
where the clock to the flip-flop is
inverted and the output of the
flip-flop is the gating signal.
• Since the gating cell is an and
cell, the gating signal must switch
only when the clock signal at the
and cell is low.
• The hold check validates whether the data (gating signal) changes before
the falling edge of MCLK at time 10ns.
Power Management
• Managing the power is an important aspect of any design
and how it is implemented.
• The power dissipated in the logic portion of the design
consists of leakage power and active power.
• In general, there are two considerations for managing the
power contributions from the digital logic, which consists of
standard cells and memory macros:
• To minimize the total active power of the design.
• To minimize the power dissipation of the design in standby mode.
• Clock Gating
• A flip-flop dissipates power due to clock toggle even when the flip-flop output
does not switch.
• The purpose of clock gating is to minimize this contribution by eliminating the
clock activity at the flip-flop during clock cycles when the flip-flop input is not
active.
• The logic restructuring through clock gating introduces gating of the clock at
the flip-flop pin.
• The clock gating thus ensures that the clock pin of the flip-flop toggles only
when new data is available at its data input.
• Power Gating
• Power gating involves gating off the power supply so that the power to the
inactive blocks can be turned off.
• In this procedure, a footer (or a header) MOS device is added in
series with the power supply.
• The control signal SLEEP is configured so that the footer (or header) MOS
device is on during normal operation of the block.
• During inactive (or sleep) mode of the block, the gating MOS device is turned off,
which eliminates any active power dissipation in the logic block.
• The footer and header devices introduce a series on-resistance to the
power supply. If the value of the on-resistance is not small, the IR drop
through the gating MOS device can affect the timing of the cells in the
logic block.
• Multi Vt Cells
• The multi Vt cells are used to trade off speed against leakage.
• Leakage: high Vt cells < standard Vt cells < low Vt cells
• Speed: high Vt cells < standard Vt cells < low Vt cells
• In most designs, the goal is to minimize the total power while achieving
the desired operational speed.
• Implementing a design with only high Vt cells to reduce leakage can
increase the total power even though the leakage contribution may be
reduced.
• High Performance Block with High Activity
• High Performance Block with Low Activity
• Well Bias
• The well bias refers to adding a small voltage bias to the P-well or N-well
used for the NMOS and PMOS devices respectively.
• The leakage power can be reduced significantly if the well
connections have a slight negative bias.
• This means that the P-well for the NMOS devices is connected to a
small negative voltage (such as -0.5V). Similarly, the N-well
connection for the PMOS devices is connected to a voltage above
the power rail (such as Vdd + 0.5V).
• By adding a well bias, the speed of the cell is impacted; however, the
leakage is reduced substantially.
• The drawback of using well bias is that it requires additional supply
levels (such as -0.5V and Vdd + 0.5V) for the P-well and N-well
connections.
Sign-off Methodology
STA can be run for many different scenarios. The three main variables that
determine a scenario are:
• Parasitic corner
• Operating mode
• PVT corner
Parasitic Interconnect Corners
Parasitics can be extracted at many corners. These are mostly governed by
the variations in the metal width and metal etch in the manufacturing
process. Some of these are:
• Typical: This refers to the nominal values for interconnect resistance and
capacitance.
• Max C: This refers to the interconnect corner which results in maximum
capacitance. The interconnect resistance is smaller than at the typical corner. This corner
results in the largest delay for paths with short nets and can be used for max path
analysis.
• Min C: This refers to the interconnect corner which results in minimum capacitance.
The interconnect resistance is larger than at the typical corner. This corner results in the
smallest delay for paths with short nets and can be used for min path analysis.
• Max RC: This refers to the interconnect corner which maximizes the interconnect RC
product. This typically corresponds to larger etch, which reduces the trace width. This
results in the largest resistance but corresponds to smaller than typical capacitance.
Overall, this corner has the largest delay for paths with long interconnects and can be used
for max path analysis.
• Min RC: This refers to the interconnect corner which minimizes the interconnect RC
product. This typically corresponds to smaller etch, which increases the trace width.
This results in the smallest resistance but corresponds to larger than typical capacitance.
Overall, this corner has the smallest path delay for paths with long interconnects and
can be used for min path analysis.
Operating Modes
The operating mode dictates the operation of the design. Various
operating modes for a design can be:
• Functional mode 1 (e.g., high-speed clocks)
• Functional mode 2 (e.g., slow clocks)
• Functional mode 3 (e.g., sleep mode)
• Functional mode 4 (e.g., debug mode)
• Test mode 1 (e.g., scan capture mode)
• Test mode 2 (e.g., scan shift mode)
• Test mode 3 (e.g., BIST mode)
• Test mode 4 (e.g., JTAG mode)
PVT Corners
The PVT corners dictate the conditions at which STA takes place.
The most common PVT corners are:
• WCS (slow process, low power supply, high temperature)
• BCF (fast process, high power supply, low temperature)
• Typical (typical process, nominal power supply, nominal temperature)
• WCL (worst-case slow at cold: slow process, low power supply, low
temperature)
Multi-mode Multi-Corner Analysis
• Multi-mode multi-corner (MMMC) analysis
refers to performing STA across multiple
operating modes, PVT corners and parasitic
interconnect corners at the same time.
• For example, consider a DUA that has four
operating modes (Normal, Sleep, Scan shift,
Jtag), and is being analyzed at three PVT
corners (WCS, BCF, WCL) and three parasitic
interconnect corners (Typical, Min C, Min
RC).
• There are a total of thirty-six possible scenarios (4 x 3 x 3) at which all timing
checks, such as setup, hold, slew, and clock gating checks, can be
performed.
• Running STA for all thirty-six scenarios at the same time can be
prohibitive in terms of runtime, depending upon the size of the
design.
• A scenario may not be necessary because it is already covered
by another scenario, or it may simply not be required.
• Also, it may not be necessary to run all modes in one corner; for example, the
Scan shift or Jtag modes may not be needed in scenario 5.
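The scenario count for this example can be enumerated directly. The sketch below uses the mode and corner names from the text; the pruning rule at the end is purely hypothetical and only illustrates how redundant scenarios might be dropped.

```python
from itertools import product

modes = ["Normal", "Sleep", "Scan shift", "Jtag"]
pvt_corners = ["WCS", "BCF", "WCL"]
rc_corners = ["Typical", "Min C", "Min RC"]

# Every (mode, PVT corner, parasitic corner) combination is one scenario.
scenarios = list(product(modes, pvt_corners, rc_corners))
print(len(scenarios))   # 4 * 3 * 3 combinations

# Hypothetical pruning rule, for illustration only: skip the test modes
# at the WCL corner.
def needed(mode, pvt, rc):
    is_test_mode = mode in ("Scan shift", "Jtag")
    return not (is_test_mode and pvt == "WCL")

pruned = [s for s in scenarios if needed(*s)]
print(len(pruned))
```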
Advantages of MMMC
• The advantage of running multi-mode multi-corner STA is the savings in
runtime and in the complexity of setting up the analysis scripts.
• An additional saving in an MMMC scenario is that the design and
parasitics need to be loaded only once or twice, as opposed to loading
these individually multiple times for each mode or corner.
• Multi-mode multi-corner has a bigger advantage in an optimization flow
where the optimization is done across all scenarios, so that fixing
timing violations in one scenario does not introduce timing violations in
another scenario.
Statistical STA
• The static timing analysis techniques described thus far are
deterministic since the analysis is based upon fixed delays for all timing
arcs in the design.
• The delay of each arc is computed based upon the operating conditions
along with the process and interconnect models.
• While there may be multiple modes and multiple corners, the timing
path delays for a given scenario are obtained deterministically
Process and Interconnect Variations
Global process
• The global process variations, which are also called inter-die
device variations, refer to the variations in the process
parameters which impact all devices on a die (or wafer); see the
figure.
• This depicts that all devices on a die are impacted similarly by
these process variations: every device on a die will be slow or
fast or anywhere in between.
• Thus, the variations modeled by the global process
parameters are intended to capture the variations from die to
die.
• For example, the parameter g_par1 may correspond to IDSsat
(device saturation current) for a standard NMOS device.
• Since this is a global parameter, all NMOS devices in all cell
instances of a die will correspond to the same value of g_par1.
Local process
• The local process variations, which are also called intra-die device
variations, refer to the variations in the process parameters which can
affect the devices differently on a given die. See Figure 10-26.
• This implies that identical devices placed side by side on a die may
behave differently.
• The variations modeled by the local process variations are intended to
capture the random process variations within the die.
• An illustration of the variations in a local process parameter is depicted
in Figure 10-27.
• The local parameter variations on a die do not track each other and
their variations from one cell instance to another cell instance are
uncorrelated.
• This means that a local parameter may have different values for
different devices on the same die.
• For example, different NAND2 cell instances on a die may see different
local process parameter values.
• This can cause different instances of the same NAND2 cell to have
different delay values even if other parameters such as input slew and
output loading are identical.
Interconnect variations
• As described in the previous section, there are various interconnect corners which represent the parameter variations of
each metal layer affecting the interconnect resistance and capacitance values.
• These parameter variations are generally the thicknesses of the metal and the dielectric, and the metal etch which
affects the width and spacing of the metal traces in various metal layers.
• In general, the parameters affecting a metal impact the parasitics of all traces in that metal layer but have minimal or no
effect on the parasitics of the traces in other metal layers.
• The statistical approach models all possible combinations of variations in the interconnect space and thus models
variations which may not be captured by analyzing only at the specified interconnect corners.
• For example, it is possible that the launch path of a clock tree is in METAL2, whereas the capture path of the clock tree
is in METAL3.
• Timing analysis at the traditional interconnect corners considers various corners which vary all metals together and thus
cannot model the scenario where METAL2 is at a corner which results in max delay and METAL3 is at a corner
which results in min delay.
• Such a combination corresponds to the worst-case scenario for the setup paths and can only be captured by modeling
the interconnect variations statistically.
Statistical Analysis
• The modeling of variations described above is feasible if the cell timing models and the interconnect parasitics are modeled statistically.
Apart from delay, the pin capacitance values at the inputs of the cells are also modeled statistically.
• This implies that the timing models are described in terms of mean and standard deviations with respect to process parameters (global
and local).
• The interconnect resistances and capacitances are described in terms of mean and standard deviations with respect to interconnect
parameters.
• The delay calculation procedures obtain the delays of each timing arc (cell as well as interconnect), which are then represented by mean
and standard deviations with respect to various parameters.
• Thus, every delay is represented by a mean and N standard deviations (where N is the number of independent process and interconnect
parameters modeled statistically).
• For example, consider a path delay composed of two timing arcs with standard deviations σ1 and σ2. Since each delay component has
its variations, the variations are combined differently depending upon whether these are correlated or uncorrelated. If the variations
are from the same source (such as those caused by g_par1, which track each other), the σ of the path delay is simply equal to (σ1 + σ2).
• However, if the variations are uncorrelated (such as those due to l_par1), the σ of the path delay is equal to sqrt(σ1^2 + σ2^2), which is
smaller than (σ1 + σ2). The phenomenon of smaller σ for the path delay when modeling local (uncorrelated) process variations is also
referred to as statistical cancellation of the individual delay variations.
• For a real design, both correlated as well as uncorrelated variations are modeled, and thus the contributions
from both of these types of variations need to be combined appropriately.
• Assuming normal distribution, effective minimum and maximum values corresponding to (mean -/+ 3σ) can be
obtained. The (mean -/+ 3σ) corresponds to the 0.135% and 99.865% quantile values of the normal distribution
shown in Figure 10-30. The 0.135% quantile means that only 0.135% of the resulting distribution is smaller
than this value (mean - 3σ).
• Similarly, the 99.865% quantile means that 99.865% of the distribution is smaller than this value, or only 0.135%
(100% - 99.865%) of the distribution is larger than this value (mean + 3σ).
• The effective lower and upper bounds are referred to as the quantiles in an SSTA report and the designer can
select the quantile value used in the analysis, such as 0.5% or 99.5%, which corresponds to (mean -/+ 2.576σ).
• Based upon the path slack distribution, the SSTA reports the mean, standard deviation and the quantile values
of slack for each path, whereby the passing or failing can be determined based upon the required statistical
confidence.
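The two ways of combining standard deviations, and the quantile multipliers quoted above, can be checked with Python's standard library. This is a minimal sketch; the two sigma values are made-up numbers chosen only to show the statistical cancellation.

```python
from math import sqrt
from statistics import NormalDist

def sigma_correlated(sigmas):
    """Fully correlated variations (same source): sigmas add linearly."""
    return sum(sigmas)

def sigma_uncorrelated(sigmas):
    """Uncorrelated variations: sigmas add in root-sum-square fashion."""
    return sqrt(sum(s * s for s in sigmas))

s1, s2 = 0.3, 0.4                      # hypothetical arc sigmas (ns)
corr = sigma_correlated([s1, s2])      # linear sum
uncorr = sigma_uncorrelated([s1, s2])  # root-sum-square, smaller than the sum

# Number of sigmas corresponding to a given quantile of a normal distribution.
z_0135 = NormalDist().inv_cdf(0.00135)   # close to -3
z_05 = NormalDist().inv_cdf(0.005)       # close to -2.576

print(f"correlated sigma   = {corr:.3f}")
print(f"uncorrelated sigma = {uncorr:.3f}")
print(f"0.135% quantile at {z_0135:.3f} sigma, 0.5% quantile at {z_05:.3f} sigma")
```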
SSTA Result
• The above report shows that while the mean of the timing path meets the requirement, the 0.135% quantile value has a violation
of 0.43ns: the path slack quantile is -0.43ns.
• The path slack has a mean value of +0.86ns with a 0.43ns standard deviation. This implies that +/- 2σ of the distribution meets the
requirement.
• Since 95.5% of the distribution falls within 2σ variation, this implies that only 2.275% of the manufactured parts will have a timing
violation (the remaining 2.275% of the distribution has large positive path slack).
• A 2.275% quantile setting will thus show a slack of 0, that is, no timing violation. The arrival time and the path slack distribution are
depicted in the figure.
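The numbers in this example follow directly from a normal slack distribution with mean 0.86ns and standard deviation 0.43ns. The sketch below reproduces them with Python's statistics module, assuming the slack is exactly normally distributed.

```python
from statistics import NormalDist

# Path slack distribution from the example (ns).
slack = NormalDist(mu=0.86, sigma=0.43)

q_3sigma = slack.inv_cdf(0.00135)    # slack at the 0.135% quantile
q_2sigma = slack.inv_cdf(0.02275)    # slack at the 2.275% quantile
fail_fraction = slack.cdf(0.0)       # fraction of parts with negative slack

print(f"0.135% quantile slack = {q_3sigma:+.2f} ns")
print(f"2.275% quantile slack = {q_2sigma:+.2f} ns")
print(f"fraction failing      = {fail_fraction:.3%}")
```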
Paths Failing Timing
• No Path Found:
When trying to obtain a path report, the STA may report that no path is found,
or it may provide a path report in which the slack is infinite. This occurs when:
i. the timing path is broken, or
ii. the path does not exist, or
iii. there is a false path.
In each of these cases, careful debugging of the constraints is required to
identify what constraint causes the path to be blocked.
• Inverted Generated Clocks:
When creating generated clocks, the -invert option needs to be used carefully. If a
generated clock is specified using the -invert option, STA assumes that the
generated clock at the specified point is of the type specified.
create_clock -name CLKM -period 10 -waveform {0 5} \
  [get_ports CLKM]
create_generated_clock -name CLKGEN -divide_by 1 -invert \
  -source [get_ports CLKM] [get_pins UCKBUF0/C]
The same specification applied at a different pin:
create_clock -name CLKM -period 10 -waveform {0 5} \
  [get_ports CLKM]
create_generated_clock -name CLKGEN -divide_by 1 -invert \
  -source [get_ports CLKM] [get_pins UCKBUF1/C]
• Missing Virtual Clock Latency:
When using virtual clocks, make sure latencies on virtual clocks are specified or are
accounted for in the set_input_delay and set_output_delay constraints.
• Large I/O Delays:
When input or output paths have timing violations, the first thing to check
is the latency on the clock used as reference to specify the input arrival time
or the output required time. This is also applicable for the previous example.
• Incorrect I/O Buffer Delay:
When a path goes through an input or an output buffer, it is possible for an
incorrect specification to cause large delay values for the input or output
buffer delays. Notice the large output buffer delay of 18ns.
• Incorrect Latency Numbers:
When a timing path fails, one thing to check is whether the latencies of the launch
clock and the capture clock are reasonable, that is, ensure that the skew between
these clocks is within acceptable limits.
• Half-cycle Path:
When a path fails timing, check the clock domains of the failing path. Along with
this, check the edges at which the launch and capture flip-flops are being
clocked.
Ensure that the data path has sufficient time to propagate.
• Missing Multicycle Hold:
For a multicycle N setup specification, it is common to see the corresponding
multicycle N-1 hold specification missing. Consequently, this can cause a large
number of unnecessary delay cells to get inserted when a tool is fixing the hold
violations.
• Path Still Not Meeting Timing:
If the data path appears to have good strong cells and the path is still failing
timing, one needs to examine the pins where the routing delay and wireload are
high.
• What if Timing Still Cannot be Met:
One can utilize useful skew to help close the timing. Useful skew is where one
purposely imbalances the clock trees, especially the launch and capture clock
paths of a failing path, so that the timing passes on that path.
It typically means that the capture clock is delayed so that the clock at the
capture flip-flop arrives later, when the data is ready.
Validating Timing Constraints
• As chip size grows, there is more and more dependence on
signing off with static timing analysis.
• The risk of relying only upon STA depends on how good
the timing constraints are.
• Checking Path Exceptions
• There are tools available that check the validity of false paths
and multicycle paths.
• These tools may also be able to generate missing false path
and multicycle path specifications based upon the structure of
the design.
• However, some of the path exceptions generated by the tools
may not be valid.
• This is because the tools typically determine exceptions using formal
verification techniques, whereas a designer has a more in-depth knowledge
of the functional behavior of the design.
• The exceptions generated by the tools should therefore be reviewed by the
designer before being used in the STA.
• Based on the semantic behavior of the design, there may be
additional exceptions that need to be defined by the
designer.
• Checking Clock Domain Crossing
• Tools are available to ensure that all clock domain crossings in
a design are valid.
• These tools may also have the capability to automatically
generate the necessary false path specifications.
• Such tools may also be able to identify illegal clock domain
crossings.
• In such cases, the tools may provide the capability to
automatically insert suitable clock synchronization logic where
required.
• An alternate way of checking asynchronous clock crossings
using STA is to set a large clock uncertainty that is equal to the
period of the sampling clock.
• Validating IO and Clock Constraints
• Validating IO and clock constraints is still a challenge. Quite
often timing simulations are performed to check the validity of
all clocks in the design.
• System timing simulations are performed to validate the IO
timing to ensure that the chip can communicate with its
peripherals without any timing issues.
