
A Novel Scan Segmentation Design Method for Avoiding Shift Timing Failure in Scan Testing


Yuta Yamato 1, Xiaoqing Wen 2, Michael A. Kochte 2,3, Kohei Miyase 2, Seiji Kajihara 2, and Laung-Terng Wang 4
1 Fukuoka Industry Science Technology Foundation, Fukuoka, Japan
2 Kyushu Institute of Technology, Iizuka, Japan
3 University of Stuttgart, Stuttgart, Germany
4 SynTest Technologies, Inc., Sunnyvale, CA, USA

Paper 12.1 INTERNATIONAL TEST CONFERENCE
978-1-4577-0152-8/11/$26.00 ©2011 IEEE
Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY BOMBAY. Downloaded on February 05,2025 at 04:10:04 UTC from IEEE Xplore. Restrictions apply.

Abstract

High power consumption in scan testing can cause undue yield loss which has increasingly become a serious problem for deep-submicron VLSI circuits. Growing evidence attributes this problem to shift timing failures, which are primarily caused by excessive switching activity in the proximities of clock paths that tends to introduce severe clock skew due to IR-drop-induced delay increase. This paper is the first of its kind to address this critical issue with a novel layout-aware scheme based on scan segmentation design, called LCTI-SS (Low-Clock-Tree-Impact Scan Segmentation). An optimal combination of scan segments is identified for simultaneous clocking so that the switching activity in the proximities of clock trees is reduced while maintaining the average power reduction effect on conventional scan segmentation. Experimental results on benchmark and industrial circuits have demonstrated the advantage of the LCTI-SS scheme.

Keywords: scan testing, shift power reduction, scan segmentation, switching activity, clock tree, clock skew.

1. Introduction

Scan design is the most widely used design-for-testability (DFT) technique [1]. It provides external access to the flip-flops (FFs) in a design by replacing FFs with scan cells and stitching them into one or more shift registers called scan chains. As a result, scan design has made it possible to test sequential circuits with reduced complexity and in practical time. In recent years, at-speed scan testing, which is realized by launching a transition and capturing its response at the system speed, has become mandatory in order to guarantee sufficient quality levels for deep-submicron (DSM) VLSI circuits. This is because timing-related defects have become dominant in such circuits.

In practice, at-speed scan testing is usually realized by the launch-on-capture (LOC) scheme since it has lower physical design complexity for the scan enable signal than other clocking schemes [2]. The basic scheme of LOC is shown in Fig. 1. In shift mode (SE = 1), a test vector is applied by operating scan chains as shift registers with multiple shift clock pulses (S1 to SL). Then, in capture mode (SE = 0), two capture pulses C1 and C2 are applied for launching a transition at the start-point of a path and capturing the circuit response to the launched transition at the end-point of the path. For at-speed scan testing, the test cycle T should be made equal to the functional clock cycle, which is extremely short for a high-speed design.

[Fig. 1: LOC timing diagram with signals SE and CLK, shift pulses S1 to SL in shift mode, launch pulse C1 and capture pulse C2 separated by test cycle T in capture mode. Shift switching activity (SSA) leads to overheating and, through IR-drop-induced delay increase, to shift timing failure (shift safety). Launch switching activity (LSA) leads, through IR-drop-induced delay increase, to capture timing failure (launch safety). Shift safety and launch safety together form test power safety.]
Fig. 1 Test power safety issues.

1.1 Test Power Safety in At-Speed Scan Testing

At-speed scan testing is indispensable for DSM VLSI circuits. However, its power dissipation, i.e., test power, is increasingly causing various problems, threatening its test power safety. The reasons are illustrated in Fig. 1 and described as follows:

In shift mode, the accumulative impact of excessive shift switching activity (SSA) may cause overheating of dies or chip packages due to excessively increased average power dissipation. This is because most of the test application time is spent in shift mode, especially for circuits with long scan chains. At the same time, the instantaneous impact of excessive SSA may cause IR-drop-induced delay increase along scan paths as well as clock paths, which ends up with shift timing failures such as setup or hold time violations and thus yield loss [3, 4]. On the other hand, in capture mode, the instantaneous impact of excessive launch switching activity (LSA) at the launch cycle C1 may cause excessive IR-drop-induced delay increase along sensitized paths, leading to capture timing failures at the capture cycle C2 and thus yield loss. The reasons are that the test cycle T is extremely short for high-speed circuits, and that low-power circuits are more susceptible to changes in power supply voltages [5-9]. Therefore, test power safety, the combination of both shift safety and launch safety, should be guaranteed for at-speed scan testing in order to avoid chip/package damage, undue yield loss, and reliability degradation [7].
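
As a rough illustration of the LOC sequence described in the Introduction, the following Python sketch enumerates the clock pulses of one test vector together with the scan enable (SE) value, and computes the fraction of pulses that are shift pulses. The function names `loc_test_cycles` and `shift_fraction` and the chain length of 1000 are assumptions made for this example only; they do not come from the paper. The sketch counts pulses and ignores the fact that the at-speed test cycle T may differ in duration from a shift cycle.

```python
from typing import Iterator, Tuple


def loc_test_cycles(chain_length: int) -> Iterator[Tuple[str, int]]:
    """Yield (pulse name, SE value) for one test vector under LOC."""
    # Shift mode (SE = 1): shift pulses S1 .. SL load the test vector.
    for i in range(1, chain_length + 1):
        yield (f"S{i}", 1)
    # Capture mode (SE = 0): C1 launches the transition, C2 captures the response.
    yield ("C1", 0)
    yield ("C2", 0)


def shift_fraction(chain_length: int) -> float:
    """Fraction of clock pulses per test vector that are shift pulses."""
    cycles = list(loc_test_cycles(chain_length))
    shift_pulses = sum(1 for _, se in cycles if se == 1)
    return shift_pulses / len(cycles)


if __name__ == "__main__":
    # Hypothetical chain lengths, chosen only for illustration.
    print(list(loc_test_cycles(3)))
    print(f"{shift_fraction(1000):.3f}")
```

With a hypothetical chain of 1000 scan cells, 1000 of the 1002 pulses per test vector are shift pulses, which is in line with the observation that most of the test application time is spent in shift mode.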
1.2 Previous Solutions for Test Power Safety
Generally, test power safety needs to be achieved by
properly reducing both SSA and LSA, as illustrated in Fig.
1. Previous solutions for reducing LSA and SSA are based
on either circuit modification or test data manipulation [6,
7]. Generally, it is preferable to reduce LSA by test data
manipulation since this approach causes no adverse impact
on ATPG, circuit design, and performance. Several
effective test-data-manipulation-based techniques [10-13]
exist for reducing LSA, which are helpful in achieving
launch safety. On the other hand, it is preferable to reduce
SSA by circuit modification since SSA often needs to be
significantly and predictably reduced to meet the heat
management requirement of packaging. Furthermore,
circuit modification in shift mode causes neither ATPG
change nor fault coverage loss. Several circuit-
modification-based approaches for reducing SSA have
been proposed so far, as summarized below:
Scan clock gating [14, 15] searches for test patterns that do not detect any new faults during BIST, and disables scan FFs while these redundant patterns are applied. Obviously, the SSA reduction effect of this approach is highly dependent on the redundant pattern count.

Scan chain disabling reduces the number of active scan chains [16] during shift and capture. This approach can also be applied with power-aware test planning for BIST [17] to reduce average SSA significantly.

Toggle suppression [14] inserts blocking logic at the outputs of scan FFs, thereby significantly reducing the average SSA in the combinational portion. However, circuit performance degradation may occur due to the insertion of blocking logic into functional paths.

Scan cell ordering [18] tries to find a proper order of scan FFs for a given test set, but its SSA reduction effect is highly test-set-dependent.

Compared with the above approaches, scan segmentation [19-22] is a preferable approach for reducing SSA. Fig. 2 illustrates the structure of conventional scan segmentation [19]. The basic idea is to split a scan chain into multiple segments, and to shift just one segment of the scan chain at a time while keeping all other segments deactivated. In Fig. 2, the original scan chain with length L (Fig. 2 (a)) is split into 3 shorter segments with length L/3 (Fig. 2 (b)). The shift operation is conducted for segments S1, S2, and S3, one by one. The currently inactive segments are silenced by gating their scan clocks. The most significant benefit of scan segmentation is that average SSA can be effectively and predictably reduced since it limits the number of scan FFs where transitions occur simultaneously. In addition, scan segmentation causes no performance degradation since it inserts no additional logic into functional paths. Furthermore, the SSA reduction effect of scan segmentation is independent of the given test set, which can be easily generated by conventional ATPG.

Fig. 2 Conventional scan segmentation.

1.3 Shift Timing Failures

Conventional scan segmentation can effectively and predictably reduce the accumulative impact of excessive SSA, thus effectively solving the overheat problem due to average SSA. However, it is unable to mitigate the instantaneous impact of excessive SSA. As a result, IR-drop-induced delay increase may still occur along clock paths from a clock pin to scan FFs, which may lead to shift timing failures and thus severely damage shift safety. This problem is illustrated in Fig. 3.

[Fig. 3: scan chains SC1 and SC2 with segments S11, S12 and S21, S22; gated clock GCLK1 is active and GCLK2 is inactive. High SSA around the active clock paths may cause shift timing failures.]
Fig. 3 Problem of conventional scan segmentation.

In Fig. 3, scan chains SC1 and SC2 are split into two scan segments {S11, S12} and {S21, S22}, respectively. The shift operation is conducted by first clocking S11 and S21 together, and then clocking S12 and S22 together. Although this scheme can reduce the global (whole-circuit) average SSA by approximately 50%, the SSA around clock paths to the active segments may still be high. That is, IR-drop-induced delay increase may still occur along the clock paths, resulting in severe clock skew. As a

result, shift timing failures may occur at scan FFs, resulting in undue yield loss.

The above discussion clearly points to a new problem that threatens shift safety, i.e., excessive SSA around clock paths. In the context of scan segmentation, this means that reducing only the global average SSA by conventional scan segmentation cannot guarantee shift safety. Therefore, there is a strong need to effectively reduce SSA around clock paths in scan segmentation.

1.4 Contribution and Paper Organization

This paper addresses the new shift safety problem caused by excessive SSA around clock paths in conventional scan segmentation. The basic idea is to optimize the combination of scan segments for simultaneous clocking since SSA depends on which segments are simultaneously clocked. For example, conventional scan segmentation shown in Fig. 3 uses segment groups {S11, S21} and {S12, S22}. However, SSA around clock paths may potentially be reduced by using a different segment grouping, e.g., {S11, S22} and {S12, S21}. Therefore, we propose a new scan segmentation scheme in which segment grouping is optimized for SSA reduction around clock paths.

The major contribution of this paper is to propose a novel layout-aware scan segmentation clocking scheme, called LCTI-SS (Low-Clock-Tree-Impact Scan Segmentation). LCTI-SS deals with the real cause of excessive-SSA-induced yield loss by reducing SSA in proximities of active clock paths (called impact areas) while preserving the benefits of conventional scan segmentation in reducing average whole-circuit shift power. A sophisticated segment regrouping algorithm is devised to directly reduce SSA in impact areas by optimizing the grouping of scan segments for simultaneous clocking. LCTI-SS improves shift safety since the reduction of instantaneous SSA is directly focused on impact areas to significantly reduce IR-drop-induced shift timing failures. To the best of our knowledge, this paper is the first of its kind to mitigate the impact of shift switching activity on clock paths.

The rest of this paper is organized as follows: Section 2 reviews conventional scan segmentation, Section 3 presents the proposed LCTI-SS scheme, Section 4 and Section 5 present the details of impact area identification and segment regrouping, respectively, Section 6 shows experimental results, and Section 7 concludes the paper.

2. Background

This section first describes the details of conventional scan segmentation for circuits with multiple scan chains. It then reviews previous clocking schemes proposed for reducing shift power in such circuits.

2.1 Conventional Multi-Scan Segmentation

Most scan circuits contain multiple scan chains. Fig. 4 shows a conventional scan segmentation design for a circuit with 3 scan chains. Each scan chain is split into 3 segments, resulting in 9 segments S11 to S33. Three gated clocks GCLK1, GCLK2, and GCLK3 are connected to all scan FFs in 3 segment groups G1 = {S11, S21, S31}, G2 = {S12, S22, S32}, and G3 = {S13, S23, S33}, respectively. Similar to scan segmentation for a single scan chain, the shift operation is conducted for G1, G2, and G3, one at a time. As shown in Fig. 4 (b), gated clocks GCLK1, GCLK2, and GCLK3 are exclusively enabled during a shift operation. Note that the test response to a test vector is captured by enabling all gated clock signals after a test vector has been shifted into all segments. Since the number of simultaneously-switching FFs becomes smaller, global average SSA is effectively reduced. Note that no modification is required on functional paths, thus avoiding any performance degradation. In addition, test application time remains the same as that of the standard scan architecture. Generally, the average shift power reduction ratio is approximately 50% for a 2-segment configuration and 66% for a 3-segment configuration [7].

Fig. 4 Conventional multi-scan segmentation.

The proposed LCTI-SS scheme is especially suitable for such multi-scan circuits. This is because in a multi-scan segmentation design, multiple segments are simultaneously clocked and there exists a possibility of selecting an optimal group of segments for simultaneous clocking so that the impact of SSA on clock paths is reduced.

2.2 Previous Low-Shift-Power Clocking Schemes

The number of simultaneously-switching FFs can be reduced by manipulating shift clocks. In staggered clocking [20], as shown in Fig. 5 (a), the shift clock edges are skewed by staggering clocks. In MD-SCAN [23], as shown in Fig. 5 (b), the shift clock edges are skewed by introducing multiple clock duty cycles with different lengths. Both clocking schemes can reduce the number of simultaneously-switching FFs. Obviously, this results in lower global average SSA.

Fig. 5 Clocking schemes for shift power reduction.

However, shift timing failures may still occur in conventional scan segmentation even when these clocking schemes are employed. As described in Subsection 1.3, the reason is that excessive IR-drop around clock paths may cause severe clock skew in clock paths, resulting in hold time violations in FFs [3, 4], which cannot be avoided by simply lowering the clock frequency [25].

3. The LCTI-SS Scheme

This section describes Low-Clock-Tree-Impact Scan Segmentation (LCTI-SS), which reduces the instantaneous shift switching activity (SSA) in the proximities of clock trees in shift mode so as to reduce the risk of timing failures in scan chains. Together with the intrinsic benefit of scan segmentation for reducing global average SSA to mitigate the overheat problem, the proposed LCTI-SS significantly improves shift safety.

Fig. 6 shows the general flow of the proposed LCTI-SS scheme. It consists of two major steps: impact area identification (step 1) and segment regrouping (step 2), as described below:

Given a circuit netlist N with standard full-scan design, conventional scan segmentation (as illustrated in Fig. 4) is first designed. The result is a new netlist N', for which place-and-route is conducted to produce a layout design L and a clock tree design C. Based on these two types of information, impact area identification (step 1) is conducted to identify nodes (gates and FFs) whose transitions have significant impact on IR-drop-induced delay increase on clock paths. After that, segment regrouping (step 2) is conducted to minimize the number of nodes in impact areas which may affect active clock paths. As a result, netlist N'', layout L', and clock tree C' are obtained by reconnecting gated clocks to corresponding segments. An alternative to clock tree modification is to use a programmable clock control [16, 25].

[Fig. 6: flow from netlist N through conventional scan segmentation design to netlist N', then place & route producing layout L and clock tree C, then (1) impact area identification producing the impact area, then (2) segment regrouping producing segment groups, then layout modification producing netlist N'', layout L', and clock tree C'.]
Fig. 6 General flow of the LCTI-SS scheme.

[Fig. 7: three scan chains (SI1 to SO1, SI2 to SO2, SI3 to SO3) with segments S11 to S33, the original clock tree, and gated clocks GCLK1, GCLK2, and GCLK3.]
Fig. 7 Example of segment regrouping.

To illustrate the LCTI-SS scheme, let us revisit the case shown in Fig. 4. Here, the initial segment groups provided by conventional scan segmentation are G1 = {S11, S21, S31},

G2 = {S12, S22, S32}, and G3 = {S13, S23, S33}. By applying the LCTI-SS scheme, scan segments are regrouped, for example, into G1' = {S13, S22, S31}, G2' = {S11, S21, S33}, and G3' = {S12, S23, S32}, as shown in Fig. 7. Gated clocks are reconnected to each corresponding segment group while most of the original clock tree design remains unchanged.

4. Impact Area Identification

This section presents the details of impact area identification, which is a critical step in LCTI-SS.

Definition 1: The clock aggressor set of a clock buffer B, denoted by CAS(B), is a set of nodes (gates and FFs) placed near B and sharing power rails with B.

Fig. 8 shows an example, where CAS(B) = {N6, N7, N10, N11, N14, N15} for the clock buffer B.

[Fig. 8: nodes N1 to N19 placed in rows around clock buffer B, with power rails VDD and VSS.]
Fig. 8 Example of clock aggressor set.

Definition 2: Let P be a path consisting of all clock buffers {B1, B2, ..., Bm} from a gated clock pin to the clock input of a scan FF. The clock aggressor region of P, denoted by CAR(P), is defined as

$$\mathrm{CAR}(P) = \bigcup_{i=1}^{m} \mathrm{CAS}(B_i)$$

Definition 3: Let S be a scan segment consisting of scan FFs {FF1, FF2, ..., FFn} and let Pi be a clock path to FFi (i = 1, 2, ..., n). The impact area of S, denoted by IA(S), is defined as

$$\mathrm{IA}(S) = \bigcup_{i=1}^{n} \mathrm{CAR}(P_i)$$

An example is shown in Fig. 9, where two scan FFs, FF1 and FF2, are assumed to form the scan segment S11. Here, CAR(P1) = CAS(B1) ∪ CAS(B2) ∪ CAS(B3) and CAR(P2) = CAS(B1) ∪ CAS(B2) ∪ CAS(B4). As a result, IA(S11) = CAR(P1) ∪ CAR(P2).

Although each segment has an impact area, this does not necessarily mean that all nodes (i.e., clock aggressors) in the impact area affect the propagation delay of clock paths. Generally, a node impacting active clock buffers needs to satisfy the following two conditions:

Condition A: The node belongs to at least one impact area of active segments.
Condition B: The node is structurally reachable from at least one scan FF in active segments.

Definition 4: Let RCAS(S) be a set of clock aggressors structurally reachable from all FFs in a segment S, and let G be a segment group composed of segments S1, S2, ..., and Sn to be clocked simultaneously. The impact aggressor set of G, denoted by IAS(G), is defined as

$$\mathrm{IAS}(G) = \Bigl(\bigcup_{i=1}^{n} \mathrm{IA}(S_i)\Bigr) \cap \Bigl(\bigcup_{i=1}^{n} \mathrm{RCAS}(S_i)\Bigr)$$

Clearly, the impact aggressor set of G contains only clock aggressors that may affect active clock paths, i.e., clock aggressors satisfying both Condition A and Condition B. An example is shown in Fig. 10. Here, two scan segments S11 and S21 are assumed to belong to G1. IA(S11) = {N1, N2, N3, N5, N6, N7}, IA(S21) = {N4, N5, N6, N7, N8, N9}, RCAS(S11) = {N1, N2, N3, N5, N7}, and RCAS(S21) = {N3, N5, N6, N8}. In this case, IAS(G1) = (IA(S11) ∪ IA(S21)) ∩ (RCAS(S11) ∪ RCAS(S21)) = {N1, N2, N3, N5, N6, N7, N8}.

From the above definitions, the impact aggressor set of a segment group with arbitrary combinations of scan segments can be calculated. This information is used to estimate the risk of shift timing failures.

[Fig. 9: clock path from CLK through buffers B1 and B2, then B3 to FF1 and B4 to FF2 of segment S11, with regions CAR(P1) and CAR(P2); IA(S11) = CAR(P1) ∪ CAR(P2).]
Fig. 9 Example of clock aggressor region and impact area.

Fig. 10 Example of impact aggressor set.

5. Segment Regrouping

Generally, the number of impact aggressors depends on the combination of segments to be simultaneously clocked. The smaller the number of impact aggressors, the lower the probability of simultaneous transitions at impact aggressors. This indicates that it is possible to regroup segments optimally so that each segment group has a smaller number of impact aggressors. This section presents

an effective algorithm for segment regrouping, which is another critical step in the LCTI-SS scheme.

The proposed algorithm for segment regrouping uses the weighted switching activity (WSA) metric for SSA estimation since this metric has good correlation with power dissipation [5, 11] and IR-drop [26].

Definition 5: Let IAS be an impact aggressor set. The weighted impact of IAS, denoted by WI(IAS), is defined as

$$\mathrm{WI}(\mathrm{IAS}) = \sum_{i=1}^{n} w_i$$

where n is the number of nodes in the impact aggressor set, and wi is the weight of node i (i = 1, 2, ..., n), which can be approximated by the number of its fanout branches.

Based on the above definitions, the problem of segment regrouping can be formalized as follows:

Segment Regrouping Problem: Given a scan segmentation design with m scan chains and n segments for each scan chain, find n segment groups G1, G2, ..., Gn such that the weighted impact of the impact aggressor set for each segment group Gi (i = 1, 2, ..., n), namely WI(IAS(Gi)), is minimized.

Theoretically, the total number of segment group combinations can be expressed by the following theorem:

Theorem 1: For a scan segmentation design with m scan chains and n segments for each scan chain, the total number of segment group combinations is (n!)^m.

Proof: For the first segment group, one of n segments can be selected from each of the m scan chains, which results in n^m possible combinations. Repeating this up to the n-th segment group results in (n-1)^m possible combinations for the second segment group, (n-2)^m possible combinations for the third segment group, ..., and 1 combination for the n-th segment group. Therefore, the total number of segment group combinations is as follows:

$$\prod_{k=0}^{n-1} (n-k)^m = (n!)^m$$

Theorem 1 indicates that it is impractical to check all possible segment group combinations to find the best one for large industrial circuits with a large number of scan chains. Therefore, we propose a heuristic two-phase algorithm to efficiently find a good segment group combination with low SSA at clock aggressors.

The proposed segment regrouping algorithm is shown in Fig. 11. In Phase 1, a segment group Gtmp with the maximum weighted impact is identified, and the segments in Gtmp are placed into separate groups G1, G2, ..., Gn in order to divide the segments in the worst-case segment group into discrete groups. Then, in Phase 2, a segment Smin is selected such that the union (Gi ∪ {Smin}) has the minimum weighted impact, and Smin is added to Gi. This process is repeated until all segments are selected. This algorithm tries to reduce SSA at clock aggressors by minimizing the weighted impact for each segment group. This way, the clock aggressors of this particular segment group in the affected area can be reduced.

As shown in Fig. 11, in Phase 1 and Phase 2 of the algorithm, segments are selected one at a time and added to a particular segment group. In Phase 1, the segment which maximizes the weighted impact of IAS for group Gtmp is selected. In Phase 2, the segment which results in the minimum weighted impact of IAS of a particular group G is selected for addition to G.

Algorithm: Segment_Regrouping {
  INPUT: netlist, impact area, initial segment groups
  OUTPUT: updated segment groups

  n = the number of groups;
  for (i = 1 to n) {
    Gi = ∅;
  }

  // Phase 1:
  Gtmp = ∅;
  for (i = 1 to n) {
    foreach (unselected segment S) {
      compute WI(IAS(Gtmp ∪ {S}));
    }
    Smax = the segment with the maximum WI(IAS(Gtmp ∪ {S}));
    // Select Smax
    Gi = Gi ∪ {Smax};
    Gtmp = Gtmp ∪ {Smax};
  }

  // Phase 2:
  while (not all segments are selected yet) {
    for (i = 1 to n) {
      foreach (unselected segment S) {
        if (S shares the same scan chain with at least one segment in Gi) {
          continue;
        } else {
          compute WI(IAS(Gi ∪ {S}));
        }
      }
      Smin = the segment with the minimum WI(IAS(Gi ∪ {S}));
      // Select Smin
      Gi = Gi ∪ {Smin};
    }
  }
  return {G1, G2, ..., Gn};
}

Fig. 11 Segment regrouping algorithm.

To find and select the segment with minimum or maximum WI, we compute the resulting WI for the considered group and all yet-unselected segments. Each segment is selected exactly once, and before each selection, WI is computed with respect to each yet-unselected segment. Thus, the number of WI computations is

$$\sum_{i=1}^{N_S} i = \frac{N_S (N_S + 1)}{2}$$

where NS is the total number of segments. To compute WI, we use optimized set operations (union, intersection) on the pre-computed sets of IA and RCAS to reduce runtime.

6. Experimental Results

The proposed LCTI-SS scheme was implemented in the C language for evaluation. Six large ITC'99 benchmark circuits (b17 to b22) [27] and one industrial circuit (ck1) were used in the experiments. Logic synthesis, layout, and transition delay ATPG were conducted by Design

Compiler®, IC Compiler, and TetraMax® from Synopsys®, respectively. Table 1 shows the profile of the circuits and corresponding test sets. The low testability of some of the ITC'99 benchmark circuits causes low fault coverage since no further test point insertion was conducted.

Table 1 Profile of Circuits and Test Sets

Circuit   # of Gates   # of FFs   # of Clock Aggressors   # of Test Vectors   Fault Cov. (%)
b17       37K          1317       5939                    999                 76.7
b18       92K          3064       13068                   2038                69.7
b19       174K         6130       28268                   2763                71.7
b20       19K          430        1841                    1514                94.3
b21       19K          430        1811                    1509                94.9
b22       28K          645        2812                    1913                95.0
ck1       2M           99759      282519                  2257                97.4

We prepared various scan configurations with different numbers of scan chains and segments for each circuit. For b17, b20, b21, and b22, configurations with 3, 5, and 10 scan chains were used. For b18 and b19, configurations with 10, 30, and 50 scan chains were used. For ck1, configurations with 100, 200, and 300 scan chains were used. After that, conventional scan segmentation with 3, 4, and 5 segments was applied for each configuration.

For evaluation, we used the WSA metric to estimate SSA. An evaluation based on electrical or circuit-level simulation is more accurate but computationally very expensive since hundreds to thousands of shift cycles have to be simulated for a single test vector alone. Since WSA has been shown to correlate well with IR-drop [26] and thus IR-drop-induced delay, we employed WSA in our experiments. We compared the proposed LCTI-SS scheme with conventional scan segmentation in terms of the weighted impact WI and WSA at impact aggressor sets.

Table 2 summarizes the experimental results. The reduction ratios of the maximum and the average weighted impact ("WI") and of the maximum and the average WSA at impact aggressor sets among segment groups ("WSA at IAS") are shown in columns 4 to 7. CPU runtime for segment regrouping ("CPU (sec)") is shown in column 8. It can be seen that, for over 80% of circuits and scan configurations used in the experiments, the targeted maximum WSA at impact aggressor sets was significantly reduced, on average by over 10% compared with conventional scan segmentation. The maximum reduction exceeded 25% in the case of b21. In addition to the reduction of maximum WSA, average WSA at impact aggressor sets was slightly reduced by 1.1% on average for all experimented configurations. Furthermore, the runtime of the proposed algorithm was relatively short even for the large industrial circuit with 2 million gates. This indicates the scalability of the proposed algorithm.

Fig. 12 shows a more detailed analysis by plotting the maximum and average WSA at the clock aggressor set per test vector for b21 for the configuration with 5 scan chains and 5 segments in each scan chain. It can be seen that both maximum and average WSA at the clock aggressor set were effectively reduced for all test vectors.

Table 2 Experimental Results

                              WI Reduction (%)   WSA at IAS Reduction (%)
Circuit  #Chains  #Segments   Max.     Ave.      Max.     Ave.              CPU (sec)
b17      3        3           8.3      1.1       11.2     1.8               0
b17      3        4           9.0      2.3       9.8      1.7               0
b17      3        5           16.1     1.1       16.8     0.9               0
b17      5        3           0.0      2.4       -5.9     -12.6             0
b17      5        4           10.1     4.5       -4.6     -1.2              0
b17      5        5           10.3     2.8       12.6     0.0               0.1
b17      10       3           3.6      4.3       3.9      -1.0              0.1
b17      10       4           11.0     5.1       1.2      -2.7              0.1
b17      10       5           -6.9     1.5       12.1     0.8               0.1
b18      10       3           5.1      2.8       2.4      -0.7              0
b18      10       4           3.3      3.5       0.2      -0.6              0
b18      10       5           7.9      4.6       -1.9     0.0               0.1
b18      30       3           5.7      3.9       13.2     1.2               0.3
b18      30       4           10.0     4.1       10.5     1.6               0.6
b18      30       5           10.0     3.9       9.2      0.4               0.9
b18      50       3           0.2      0.4       -2.7     -1.4              1.4
b18      50       4           0.3      2.3       1.4      1.9               2.6
b18      50       5           6.9      4.0       2.1      1.7               4.1
b19      10       3           8.0      -0.5      3.2      5.6               0
b19      10       4           8.6      -1.3      -1.3     4.8               0.1
b19      10       5           6.7      -1.7      3.0      5.4               0.1
b19      30       3           0.7      2.1       6.5      0.9               0.7
b19      30       4           5.9      2.9       10.6     0.6               1.2
b19      30       5           8.2      5.0       4.8      1.3               1.9
b19      50       3           0.8      1.6       4.2      1.1               3
b19      50       4           0.3      2.8       1.8      0.1               5.1
b19      50       5           6.3      3.1       0.3      0.3               8.2
b20      3        3           2.1      3.7       1.6      7.5               0
b20      3        4           1.9      -2.6      -2.9     1.7               0
b20      3        5           3.4      -0.9      6.7      5.1               0
b20      5        3           -2.3     7.0       7.1      -2.1              0
b20      5        4           9.6      5.8       7.2      -0.3              0
b20      5        5           11.4     7.0       1.2      2.4               0
b20      10       3           2.4      7.1       -1.7     0.9               0
b20      10       4           7.3      -1.3      7.5      -2.1              0
b20      10       5           10.1     3.7       1.3      4.1               0
b21      3        3           17.4     6.2       7.3      -0.4              0
b21      3        4           7.1      1.1       -14.7    -4.5              0
b21      3        5           21.2     3.6       10.4     -4.8              0
b21      5        3           4.3      0.4       6.8      0.3               0
b21      5        4           10.3     1.7       6.7      4.0               0
b21      5        5           15.9     7.8       15.5     4.9               0
b21      10       3           3.3      0.6       6.4      0.8               0
b21      10       4           13.7     10.4      25.1     7.6               0
b21      10       5           8.5      7.7       4.8      6.2               0
b22      3        3           3.8      0.2       -1.2     0.4               0
b22      3        4           8.9      -0.4      15.0     0.7               0
b22      3        5           15.4     -0.4      18.3     -2.3              0
b22      5        3           -1.4     -1.1      5.0      2.3               0
b22      5        4           4.7      2.9       18.2     1.8               0
b22      5        5           7.8      4.9       7.1      1.3               0
b22      10       3           -1.6     0.7       6.2      -2.2              0
b22      10       4           11.1     0.4       13.7     -0.3              0.1
b22      10       5           6.9      0.5       -0.9     -1.8              0.1
ck1      100      3           12.2     8.9       0.5      3.1               52.3
ck1      100      4           3.3      3.9       5.8      2.5               99.4
ck1      100      5           8.5      5.9       -2.7     0.1               156.5
ck1      200      3           5.2      3.5       11.0     0.6               453.8
ck1      200      4           5.1      1.4       10.4     2.0               840.4
ck1      200      5           12.1     3.1       10.3     2.3               1307.3
ck1      300      3           1.9      2.5       4.5      1.1               2112.4
ck1      300      4           5.8      4.3       15.8     0.3               4046.6
ck1      300      5           2.5      1.8       9.4      0.5               6423.4
Ave.                          5.4      1.5       10.3     1.1

Fig. 13 depicts the reduction ratio of the maximum weighted impact and the reduction ratio of the maximum WSA at the impact aggressor set and their correlation for all circuits and scan configurations used in the experiments. It can be seen that with the increasing reduction ratio of WI, the WSA reduction also tends to increase. There are a few outliers, e.g., for the case of b21 with 3 scan chains and 4 segments in each scan chain. This indicates that even though the weighted impact has a rather decent

Paper 12.1 INTERNATIONAL TEST CONFERENCE 7

correlation with the WSA at impact aggressor sets, a more accurate metric for the segment regrouping algorithm may be needed to further improve the maximum WSA reduction at impact aggressor sets.

Fig. 12 WSA plot for b21.

Fig. 13 Max. WI reduction vs. max. WSA reduction at IAS (x-axis: Reduction Ratio of Max. WI (%); y-axis: Reduction Ratio of Max. WSA at IAS (%)).

7. Conclusions

This paper is the first of its kind to address the problem of IR-drop-induced shift timing failures by a novel layout-aware scan segmentation scheme, namely Low-Clock-Tree-Impact Scan Segmentation (LCTI-SS). The LCTI-SS scheme identifies an optimal combination of scan segments for simultaneous clocking so that shift switching activity in the proximity of clock trees is reduced. This helps reduce IR-drop-induced shift clock skew, which is becoming a major cause of scan shift failures, and thus helps improve shift safety in scan testing.

Future work to further improve shift safety includes: (1) evaluating, by using precise circuit-level power analysis, whether the LCTI-SS scheme is sufficient to totally avoid shift timing failures; and (2) finding a metric which correlates more closely with IR-drop than WSA.

Acknowledgments

This work was partly supported by JSPS KAKENHI Grant-in-Aid for Scientific Research (B) 22300017. M. Kochte was a Visiting Researcher at Kyushu Institute of Technology in 2010, supported by the German Academic Exchange Service (DAAD).

References

[1] L.-T. Wang, C.-W. Wu, and X. Wen, Editors, VLSI Test Principles and Architectures: Design for Testability, San Francisco: Morgan Kaufmann, 2006.
[2] J. Saxena, et al., “Scan-Based Transition Fault Testing – Implementation and Low Cost Test Challenges,” Proc. IEEE Intl. Test Conf., pp. 1120-1129, 2002.
[3] Y. Huang, et al., “Statistical Diagnosis for Intermittent Scan Chain Hold-Time Fault,” Proc. IEEE Intl. Test Conf., pp. 319-328, 2003.
[4] Y. Wu, “Diagnosis of Scan Chain Failures,” Proc. IEEE Intl. Symp. on Defect and Fault Tolerance, pp. 217-222, 1998.
[5] P. Girard, “Survey of Low-Power Testing of VLSI Circuits,” IEEE Design & Test of Computers, Vol. 19, No. 3, pp. 82-92, Feb. 2002.
[6] J. Saxena, K. M. Butler, and L. Whetsel, “An Analysis of Power Reduction Techniques in Scan Testing,” Proc. IEEE Intl. Test Conf., pp. 670-677, 2001.
[7] P. Girard, N. Nicolici, and X. Wen, Editors, Power-Aware Testing and Test Strategies for Low Power Devices, Springer, 2009.
[8] C. P. Ravikumar, M. Hirech, and X. Wen, “Test Strategies for Low-Power Devices,” J. of Low Power Electronics, Vol. 4, No. 2, pp. 127-138, Aug. 2008.
[9] J. Saxena, et al., “A Case Study of IR-Drop in Structured At-Speed Testing,” Proc. IEEE Intl. Test Conf., pp. 1098-1104, 2003.
[10] X. Wen, et al., “On Low-Capture-Power Test Generation for Scan Testing,” Proc. IEEE VLSI Test Symp., pp. 265-270, 2005.
[11] S. Remersaro, et al., “Preferred Fill: A Scalable Method to Reduce Capture Power for Scan Based Designs,” Proc. IEEE Intl. Test Conf., Paper 32.2, 2006.
[12] K. Enokimoto, et al., “CAT: A Critical-Area-Targeted Test Set Modification Scheme for Reducing Launch Switching Activity in At-Speed Scan Testing,” Proc. IEEE Asian Test Symp., pp. 99-104, 2009.
[13] Y. Yamato, et al., “A GA-Based Method for High-Quality X-Filling to Reduce Launch Switching Activity in At-Speed Scan Testing,” Proc. IEEE Pacific Rim Intl. Symp. on Dependable Computing, pp. 81-86, 2009.
[14] S. Gerstendorfer and H.-J. Wunderlich, “Minimized Power Consumption for Scan-Based BIST,” Proc. IEEE Intl. Test Conf., pp. 77-84, 1999.
[15] P. Girard, et al., “A Test Vector Inhibiting Technique for Low Energy BIST Design,” Proc. IEEE VLSI Test Symp., pp. 407-412, 1999.
[16] R. Sankaralingam and N. A. Touba, “Reducing Test Power During Test Using Programmable Scan Chain Disable,” Proc. Intl. Workshop on Electronic Design, Test and Applications, pp. 159-163, 2002.
[17] M. E. Imhof, et al., “Scan Test Planning for Power Reduction,” Proc. Design Automation Conf., pp. 521-526, 2007.
[18] Y. Bonhomme, et al., “Efficient Scan Chain Design for Power Minimization during Scan Testing under Routing Constraint,” Proc. IEEE Intl. Test Conf., pp. 488-493, 2003.
[19] L. Whetsel, “Adapting Scan Architectures for Low Power Operation,” Proc. IEEE Intl. Test Conf., pp. 863-872, 2000.
[20] Y. Bonhomme, et al., “A Gated Clock Scheme for Low Power Scan Testing of Logic ICs or Embedded Cores,” Proc. IEEE Asian Test Symp., pp. 253-258, 2001.
[21] P. Girard, et al., “A Modified Clock Scheme for a Low Power BIST Test Pattern Generator,” Proc. IEEE Intl. Test Conf., pp. 652-661, 2001.
[22] P. Rosinger, et al., “Scan Architecture With Mutually Exclusive Scan Segment Activation for Shift- and Capture-Power Reduction,” IEEE Trans. Computer-Aided Design, Vol. 23, No. 7, pp. 1142-1153, Jul. 2004.
[23] T. Yoshida and M. Watari, “A New Approach for Low Power Scan Testing,” Proc. IEEE Intl. Test Conf., pp. 480-487, 2003.
[24] A. Al-Yamani, E. Chmelar, and G. Grinchuck, “Segmented addressable scan architecture,” Proc. IEEE VLSI Test Symp., pp. 405-411, 2005.
[25] E. G. Friedman, “Clock Distribution Networks in Synchronous Digital Integrated Circuits,” Proc. of The IEEE, Vol. 89, No. 5, pp. 665-692, May 2001.
[26] K. Noda, et al., “Power and Noise Aware Test Using Preliminary Estimation,” Proc. VLSI Design, Automation and Test, pp. 323-326, 2009.
[27] IWLS 2005 Benchmarks, http://www.iwls.org/iwls2005/benchmarks.html
[28] X. Wen, et al., “Power-Aware Test Generation with Guaranteed Launch Safety for At-Speed Scan Testing,” Proc. IEEE VLSI Test Symp., pp. 166-171, 2011.
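
The statements made about Table 2 and Fig. 13 can be re-checked from the published rows. The following Python sketch is not part of the original experiments. It assumes only the b21, b22, and ck1 rows of Table 2 (so means over these rows need not match the "Ave." row), and the names `ROWS`, `improved_count`, `share_improved`, `best_config`, and `pearson` are hypothetical helpers introduced here for illustration. It counts the configurations whose maximum WSA at IAS was reduced, finds the configuration with the largest reduction, and computes a Pearson coefficient between the max-WI and max-WSA reduction ratios, which is one possible way to quantify the trend plotted in Fig. 13.

```python
from math import sqrt

# Columns: circuit, scan chains, segments per chain,
#          max-WI reduction (%), max-WSA-at-IAS reduction (%).
# Values transcribed from the b21, b22, and ck1 rows of Table 2.
ROWS = [
    ("b21", 3, 3, 17.4, 7.3), ("b21", 3, 4, 7.1, -14.7), ("b21", 3, 5, 21.2, 10.4),
    ("b21", 5, 3, 4.3, 6.8), ("b21", 5, 4, 10.3, 6.7), ("b21", 5, 5, 15.9, 15.5),
    ("b21", 10, 3, 3.3, 6.4), ("b21", 10, 4, 13.7, 25.1), ("b21", 10, 5, 8.5, 4.8),
    ("b22", 3, 3, 3.8, -1.2), ("b22", 3, 4, 8.9, 15.0), ("b22", 3, 5, 15.4, 18.3),
    ("b22", 5, 3, -1.4, 5.0), ("b22", 5, 4, 4.7, 18.2), ("b22", 5, 5, 7.8, 7.1),
    ("b22", 10, 3, -1.6, 6.2), ("b22", 10, 4, 11.1, 13.7), ("b22", 10, 5, 6.9, -0.9),
    ("ck1", 100, 3, 12.2, 0.5), ("ck1", 100, 4, 3.3, 5.8), ("ck1", 100, 5, 8.5, -2.7),
    ("ck1", 200, 3, 5.2, 11.0), ("ck1", 200, 4, 5.1, 10.4), ("ck1", 200, 5, 12.1, 10.3),
    ("ck1", 300, 3, 1.9, 4.5), ("ck1", 300, 4, 5.8, 15.8), ("ck1", 300, 5, 2.5, 9.4),
]


def improved_count(rows):
    """Number of configurations with a positive max-WSA reduction ratio."""
    return sum(1 for row in rows if row[4] > 0.0)


def share_improved(rows):
    """Fraction of configurations with a positive max-WSA reduction ratio."""
    return improved_count(rows) / len(rows)


def best_config(rows):
    """Row with the largest max-WSA reduction ratio."""
    return max(rows, key=lambda row: row[4])


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    wi_max = [row[3] for row in ROWS]
    wsa_max = [row[4] for row in ROWS]
    print("configurations improved:", improved_count(ROWS), "of", len(ROWS))
    print("share improved:", round(share_improved(ROWS), 3))
    print("largest max-WSA reduction:", best_config(ROWS))
    print("Pearson r (max WI vs. max WSA):", round(pearson(wi_max, wsa_max), 3))
```

A single correlation coefficient hides outliers such as b21 with 3 scan chains and 4 segments, so it complements rather than replaces the scatter plot of Fig. 13.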


