STA Chapter-10 by Jay Bhaskar
• With these derating values, we get the following for the setup check:
LaunchClockPath = 2.0 * 1.2 = 2.4
MaxDataPath = 5.2 * 1.2 = 6.24
CaptureClockPath = 2.06 * 0.9 = 1.854
Tsetup_UFF1 = 0.35 * 1.1 = 0.385
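The derated values above can be reproduced with a few lines of arithmetic. This is a minimal sketch, not a tool flow: the constant and function names are hypothetical, and the derate factors (1.2 late, 0.9 early, 1.1 for the setup time) are simply read off the products listed above. The final line assumes the usual setup condition, namely that launch clock path plus data path plus setup time must not exceed the clock period plus the capture clock path.

```python
# Derate factors read off the products above (names are illustrative only).
LATE_DERATE = 1.2        # applied to the launch clock path and max data path
EARLY_DERATE = 0.9       # applied to the capture clock path
CELL_CHECK_DERATE = 1.1  # applied to the setup time of UFF1


def derated_setup_terms(launch_clk, max_data, capture_clk, t_setup):
    """Return the four derated terms of the setup check, in ns."""
    return {
        "LaunchClockPath": launch_clk * LATE_DERATE,
        "MaxDataPath": max_data * LATE_DERATE,
        "CaptureClockPath": capture_clk * EARLY_DERATE,
        "Tsetup_UFF1": t_setup * CELL_CHECK_DERATE,
    }


def min_clock_period(terms):
    """Smallest period satisfying launch + data + setup <= period + capture."""
    return (terms["LaunchClockPath"] + terms["MaxDataPath"]
            + terms["Tsetup_UFF1"] - terms["CaptureClockPath"])


terms = derated_setup_terms(2.0, 5.2, 2.06, 0.35)
for name, value in terms.items():
    print(f"{name} = {value:.3f} ns")
print(f"Minimum clock period = {min_clock_period(terms):.3f} ns")
```

Derating the launch and data paths upward while derating the capture path downward makes the setup check pessimistic, which is why the late and early factors sit on opposite sides of 1.0.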
• The setup data check implies that SCTRL should arrive at least 2.1ns prior to the edge
of the related pin SDA. Otherwise, it is a data to data setup check violation.
• The hold data check specifies that SCTRL should arrive at least 1.5ns after SDA. If the
constrained signal arrives earlier than this specification, it is a data to data hold
check violation.
• Consider the AND cell shown in the figure. We
assume the requirement is to ensure that
PNA arrives 1.8ns before the rising edge of
PREAD and that it does not change for
1.0ns after the rising edge of PREAD.
• In this example, PNA is the constrained pin
and PREAD is the related pin. The required
waveforms are shown in the figure.
• Such a requirement can be specified using
a data to data setup and hold check.
• set_data_check -from UAND0/A1 -to UAND0/A2 -setup 1.8
• set_data_check -from UAND0/A1 -to UAND0/A2 -hold 1.0
• The zero-cycle setup check causes the hold timing check to be different
from other hold check reports: the hold check is no longer on the same
clock edge.
• In some scenarios, a designer may require the data to data hold check
to be performed on the same clock cycle.
• set_multicycle_path -1 -hold -to UAND0/A2
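The arithmetic behind a same-cycle data to data check can be sketched as follows. This is only an illustration under stated assumptions: the function name and the arrival times are hypothetical, the setup and hold values default to the 1.8ns and 1.0ns used in the AND cell example, and a real STA tool would use the latest and earliest arrival times of each pin rather than single values.

```python
def data_check_slacks(related_edge, constrained_arrival,
                      constrained_next_change, setup=1.8, hold=1.0):
    """Return (setup_slack, hold_slack) in ns for a same-cycle data to data check.

    related_edge            : time of the rising edge on the related pin (PREAD)
    constrained_arrival     : time the constrained pin (PNA) becomes valid
    constrained_next_change : time the constrained pin next changes
    """
    # Setup: the constrained pin must be valid at least `setup` before the edge.
    setup_slack = (related_edge - setup) - constrained_arrival
    # Hold: the constrained pin must not change until `hold` after the edge.
    hold_slack = constrained_next_change - (related_edge + hold)
    return setup_slack, hold_slack


# Hypothetical arrival times, chosen only for illustration.
setup_slack, hold_slack = data_check_slacks(
    related_edge=5.0, constrained_arrival=3.0, constrained_next_change=6.5)
print(f"setup slack = {setup_slack:.2f} ns, hold slack = {hold_slack:.2f} ns")
```

A negative value for either slack would correspond to a data to data setup or hold violation.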
Non-Sequential Checks
• These checks are applied only to pins within a single cell or macro
• A non-sequential check is a check between two pins, neither of which is a clock. One pin is the
constrained pin, which acts like data, while the second pin is the related pin, which acts like a clock.
• The check specifies how long the data on the constrained pin must be stable before and after the
change on the related pin. This check is specified as part of the cell library specification, and no explicit
data to data check constraint is required.
• Difference between the non-sequential checks and data to data checks:
• The clock gating thus ensures that the clock pin of the flip-flop toggles only
when new data is available at its data input.
• Power Gating
• Power gating involves gating off the power supply so that the power to the
inactive blocks can be turned off.
• In this procedure, a footer (or a header) MOS device is added in
series with the power supply.
• The control signal SLEEP is configured so that the footer (or a header) MOS
device is on during normal operation of the block.
• During inactive (or sleep) mode of the block, the gating MOS device is turned off
which eliminates any active power dissipation in the logic block.
• The footer and header devices introduce a series on-resistance to the
power supply. If the value of the on-resistance is not small, the IR drop
through the gating MOS device can affect the timing of the cells in the
logic block.
• Multi Vt Cells
• The multi Vt cells are used to trade off speed against leakage.
• Leakage: high Vt cells < standard Vt cells < low Vt cells
• Speed: high Vt cells < standard Vt cells < low Vt cells
• In most designs, the goal is to minimize the total power while achieving
the desired operational speed.
• Implementing a design with only high Vt cells to reduce leakage can
increase the total power even though the leakage contribution may be
reduced.
• High Performance Block with High Activity
• High Performance Block with Low Activity
• Well Bias
• The well bias refers to adding a small voltage bias to the P-well or N-
well used for the NMOS and PMOS devices respectively.
• The leakage power can be reduced significantly if the well
connections have a slight negative bias.
• This means that the P-well for the NMOS devices is connected to a
small negative voltage (such as -0.5V). Similarly, the N-well
connection for the PMOS devices is connected to a voltage above
the power rail (such as Vdd + 0.5V).
• By adding a well bias, the speed of the cell is impacted; however the
leakage is reduced substantially.
• The drawback of using well bias is that it requires additional supply
levels (such as -0.5V and Vdd+0.5V) for the P-well and N-well
connections.
Sign-off Methodology
STA can be run for many different scenarios. The three main variables that
determine a scenario are:
• Parasitics corners
• Operating mode
• PVT corner
Parasitics Interconnect Corners:
Parasitics can be extracted at many corners. These are mostly governed by
the variations in the metal width and metal etch in the manufacturing
process. Some of these are:
• Typical: This refers to the nominal values for interconnect resistance and
capacitance.
• Max C: This refers to the interconnect corner which results in maximum
capacitance. The interconnect resistance is smaller than at typical corner. This corner
results in largest delay for paths with short nets and can be used for max path
analysis.
• Min C: This refers to the interconnect corner which results in minimum capacitance.
The interconnect resistance is larger than at typical corner. This corner results in
smallest delay for paths with short nets and can be used for min path analysis.
• Max RC: This refers to the interconnect corner which maximizes the interconnect RC
product. This typically corresponds to larger etch, which reduces the trace width. This
results in the largest resistance but corresponds to smaller than typical capacitance.
Overall, this corner has the largest delay for paths with long interconnects and can be used
for max path analysis.
• Min RC: This refers to the interconnect corner which minimizes the interconnect RC
product. This typically corresponds to smaller etch which increases the trace width.
This results in smallest resistance but corresponds to larger than typical capacitance.
Overall, this corner has the smallest path delay for paths with long interconnects and
can be used for min path analysis.
Operating Modes
The operating mode dictates the operation of the design. Various
operating modes for a design can be:
• Functional mode 1 (e.g. high-speed clocks)
• Functional mode 2 (e.g. slow clocks)
• Functional mode 3 (e.g. sleep mode)
• Functional mode 4 (e.g. debug mode)
• Test mode 1 (e.g. scan capture mode)
• Test mode 2 (e.g. scan shift mode)
• Test mode 3 (e.g. bist mode)
• Test mode 4 (e.g. jtag mode)
PVT Corners
The PVT corners dictate at what conditions the STA analysis takes place.
The most common PVT corners are:
• WCS (slow process, low power supply, high temperature)
• BCF (fast process, high power supply, low temperature)
• Typical (typical process, nominal power supply, nominal temperature)
• WCL (worst-case slow at cold: slow process, low power supply, low
temperature)
Multi-mode Multi-Corner Analysis
• Multi-mode multi-corner (MMMC) analysis
refers to performing STA across multiple
operating modes, PVT corners and parasitic
interconnect corners at the same time.
• For example, consider a DUA that has four
operating modes (Normal, Sleep, Scan shift,
Jtag), and is being analyzed at three PVT
corners (WCS, BCF, WCL) and three parasitic
interconnect corners (Typical, Min C, Min
RC).
• There are a total of thirty-six possible scenarios at which all timing
checks, such as setup, hold, slew, and clock gating checks, can be
performed.
• Running STA for all thirty-six scenarios at the same time can be
prohibitive in terms of runtime, depending upon the size of the
design.
• Some scenarios may not be necessary, either because they are
covered by another scenario or because they are not required at all.
• Also, it may not be necessary to run all modes in one corner. For example,
Scan shift or Jtag modes may not be needed in scenario 5.
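The scenario count in the example above is the Cartesian product of modes, PVT corners and parasitic corners. The sketch below enumerates it. The pruning rule is purely hypothetical and is included only to show how unnecessary scenarios might be dropped; the slides do not say which scenarios can actually be skipped.

```python
from itertools import product

# Modes and corners taken from the example above.
modes = ["Normal", "Sleep", "Scan shift", "Jtag"]
pvt_corners = ["WCS", "BCF", "WCL"]
parasitic_corners = ["Typical", "Min C", "Min RC"]

# Every (mode, PVT corner, parasitic corner) combination is one scenario.
scenarios = list(product(modes, pvt_corners, parasitic_corners))
print(f"Total scenarios: {len(scenarios)}")


def is_needed(mode, pvt, parasitic):
    """Hypothetical pruning rule, for illustration only:
    skip the test modes at the WCL corner."""
    return not (mode in ("Scan shift", "Jtag") and pvt == "WCL")


pruned = [s for s in scenarios if is_needed(*s)]
print(f"Scenarios after pruning: {len(pruned)}")
```

Even a simple rule like this reduces the number of runs, which is the motivation for reviewing the scenario list before launching a full analysis.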
Advantages of MMMC
• The advantage of running multi-mode multi-corner STA is the savings in
runtime and in the complexity of setting up the analysis scripts.
• An additional saving in an MMMC scenario is that the design and
parasitics need to be loaded only once or twice, as opposed to loading
these individually multiple times for each mode or corner.
• Multi-mode multi-corner has a bigger advantage in an optimization flow
where the optimization is done across all scenarios such that fixing
timing violations in one scenario does not introduce timing violations in
another scenario.
Statistical STA
• The static timing analysis techniques described thus far are
deterministic since the analysis is based upon fixed delays for all timing
arcs in the design.
• The delay of each arc is computed based upon the operating conditions
along with the process and interconnect models.
• While there may be multiple modes and multiple corners, the timing
path delays for a given scenario are obtained deterministically.
Process and Interconnect Variations
Global process
• The global process variations, which are also called inter-die
device variations, refer to the variations in the process
parameters which impact all devices on a die (or wafer). See
the accompanying figure.
• This depicts that all devices on a die are impacted similarly by
these process variations: every device on a die will be slow or
fast or anywhere in between.
• Thus, the variations modeled by the global process
parameters are intended to capture the variations from die to
die.
• For example, the parameter g_par1 may correspond to IDSsat
(device saturation current) for a standard NMOS device.
• Since this is a global parameter, all NMOS devices in all cell
instances of a die will correspond to the same value of g_par1.
Local process
• The local process variations, which are also called intra-die device
variations, refer to the variations in the process parameters which can
affect the devices differently on a given die. See Figure 10-26.
• This implies that identical devices placed side by side on the same die may
behave differently.
• The variations modeled by the local process variations are intended to
capture the random process variations within the die.
• An illustration of the variations in a local process parameter is depicted
in Figure 10-27.
• The local parameter variations on a die do not track each other and
their variations from one cell instance to another cell instance are
uncorrelated.
• This means that a local parameter may have different values for
different devices on the same die.
• For example, different NAND2 cell instances on a die may see different
local process parameter values.
• This can cause different instances of the same NAND2 cell to have
different delay values even if other parameters such as input slew and
output loading are identical.
Interconnect variations
• As described in the previous section, there are various interconnect corners which represent the parameter variations of
each metal layer affecting the interconnect resistance and capacitance values.
• These parameter variations are generally the thicknesses of the metal and the dielectric, and the metal etch which
affects the width and spacing of the metal traces in various metal layers.
• In general, the parameters affecting a metal impact the parasitics of all traces in that metal layer but have minimal or no
effect on the parasitics of the traces in other metal layers.
• The statistical approach models all possible combinations of variations in the interconnect space and thus models
variations which may not be captured by analyzing only at the specified interconnect corners.
• For example, it is possible that the launch path of a clock tree is in METAL2, whereas the capture path of the clock tree
is in METAL3.
• Timing analysis at the traditional interconnect corners considers various corners which vary all metals together and thus
cannot model the scenario where the METAL2 is at a corner which results in max delay, and the METAL3 is at a corner
which results in min delay.
• Such a combination corresponds to the worst-case scenario for the setup paths and can only be captured by modeling
the interconnect variations statistically.
Statistical Analysis
• The modeling of variations described above is feasible if the cell timing models and the interconnect parasitics are modeled statistically.
Apart from delay, the pin capacitance values at the inputs of the cells are also modeled statistically.
• This implies that the timing models are described in terms of mean and standard deviations with respect to process parameters (global
and local).
• The interconnect resistances and capacitances are described in terms of mean and standard deviations with respect to interconnect
parameters.
• The delay calculation procedures obtain the delays of each timing arc (cell as well as interconnect) which are then represented by mean
and standard deviations with respect to various parameters.
• Thus, every delay is represented by a mean and N standard deviations (where N is the number of independent process and interconnect
parameters modeled statistically).
• For example, consider a path delay composed of two timing arcs. Since each delay component has its variations, the variations are
combined differently depending upon whether they are correlated or uncorrelated. If the variations are from the same source (such as
those caused by g_par1, which track each other), the σ of the path delay is simply equal to (σ1 + σ2).
• However, if the variations are uncorrelated (such as those due to l_par1), the σ of the path delay is equal to sqrt(σ1² + σ2²), which is
smaller than (σ1 + σ2). The phenomenon of a smaller σ for the path delay when modeling local (uncorrelated) process variations is also
referred to as statistical cancellation of the individual delay variations.
• For a real design, both correlated and uncorrelated variations are modeled, and thus the contributions
from both of these types of variations need to be combined appropriately.
• Assuming a normal distribution, effective minimum and maximum values corresponding to (mean -/+ 3σ) can be
obtained. These correspond to the 0.135% and 99.865% quantile values of the normal distribution
shown in Figure 10-30. The 0.135% quantile means that only 0.135% of the resulting distribution is smaller
than this value (mean - 3σ).
• Similarly, the 99.865% quantile means that 99.865% of the distribution is smaller than this value, or only 0.135%
(100% - 99.865%) of the distribution is larger than this value (mean + 3σ).
• The effective lower and upper bounds are referred to as the quantiles in an SSTA report, and the designer can
select the quantile value used in the analysis, such as 0.5% or 99.5%, which corresponds to (mean -/+ 2.576σ).
• Based upon the path slack distribution, the SSTA reports the mean, standard deviation and the quantile values
of slack for each path, whereby passing or failing can be determined based upon the required statistical
confidence.
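The two combination rules and the quantile multipliers quoted above can be checked numerically. This is a minimal sketch with hypothetical function names and made-up σ values. The `combined_sigma` helper additionally assumes that the global (correlated) and local (uncorrelated) contributions are independent of each other, which is an assumption of this sketch and not something the slides state.

```python
from math import sqrt
from statistics import NormalDist


def correlated_sigma(sigmas):
    """Variations from the same source track each other: sigmas add linearly."""
    return sum(sigmas)


def uncorrelated_sigma(sigmas):
    """Independent variations: sigmas add in quadrature (root-sum-square)."""
    return sqrt(sum(s * s for s in sigmas))


def combined_sigma(global_sigmas, local_sigmas):
    """Assumption of this sketch: global and local parts are independent."""
    g = correlated_sigma(global_sigmas)
    l = uncorrelated_sigma(local_sigmas)
    return sqrt(g * g + l * l)


def z_multiplier(quantile):
    """Number of sigmas from the mean for a given quantile of a normal distribution."""
    return NormalDist().inv_cdf(quantile)


# Hypothetical per-arc sigmas in ns, for illustration only.
s1, s2 = 0.03, 0.04
print(f"correlated:   {correlated_sigma([s1, s2]):.3f} ns")
print(f"uncorrelated: {uncorrelated_sigma([s1, s2]):.3f} ns")
print(f"0.135% quantile multiplier: {z_multiplier(0.00135):.3f}")
print(f"99.5% quantile multiplier:  {z_multiplier(0.995):.3f}")
```

The root-sum-square result is always at most the linear sum, which is the statistical cancellation effect described above.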
SSTA Result
• The above report shows that while the mean of the timing path meets the requirement, the 0.135% quantile value has a violation
of 0.43ns: the path slack quantile is -0.43ns.
• The path slack has a mean value of +0.86ns with a standard deviation of 0.43ns. Since mean - 2σ = 0, the part of the distribution
within +/- 2σ of the mean meets the requirement.
• Since 95.45% of the distribution falls within 2σ of the mean, this implies that only 2.275% of the manufactured parts will have a timing
violation (the remaining 2.275% of the distribution has large positive path slack).
• A 2.275% quantile setting will thus show a slack of 0, that is, no timing violation. The arrival time and the path slack distributions are
depicted in the figure.
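The numbers in this result follow directly from a normal distribution with the reported mean and standard deviation. The short check below is a sketch that assumes the slack is normally distributed, as the slides do, and uses Python's standard `statistics.NormalDist`.

```python
from statistics import NormalDist

# Path slack distribution from the report: mean +0.86ns, sigma 0.43ns.
slack = NormalDist(mu=0.86, sigma=0.43)

# 0.135% quantile, i.e. approximately mean - 3 sigma.
q_low = slack.inv_cdf(0.00135)

# Fraction of parts with negative slack, i.e. P(slack < 0).
fail_fraction = slack.cdf(0.0)

print(f"0.135% quantile slack: {q_low:.2f} ns")
print(f"Fraction failing timing: {fail_fraction * 100:.3f} %")
```

Because the mean is exactly two standard deviations above zero, the failing fraction equals the lower tail beyond 2σ.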
Paths Failing Timing
• No Path Found:
When trying to obtain a path report, STA may report that no path is found,
or it may provide a path report in which the slack is infinite. This can happen when:
i. the timing path is broken, or
ii. the path does not exist, or
iii. there is a false path.
In each of these cases, careful debugging of the constraints is required to
identify which constraint causes the path to be blocked.
• Inverted Generated Clocks :
When creating generated clocks, the -invert option needs to be used carefully. If a
generated clock is specified using the -invert option, STA assumes that the
generated clock at the specified point is of the type specified.