Introduction to Electronic

Design Automation
Jie-Hong Roland Jiang
江介宏

Department of Electrical Engineering


National Taiwan University

Spring 2014

Physical Design
High-level synthesis

Logic synthesis

Physical design

Slides are courtesy of Prof. Y.-W. Chang

Physical Design
 Physical design converts a circuit description into a geometric
description.
 The description is used to manufacture a chip.
 Physical design cycle:
1. Logic partitioning
2. Floorplanning and placement
3. Routing
4. Compaction
 Others: circuit extraction, timing verification and design rule
checking

Physical Design Flow

Outline
Partitioning

Floorplanning

Placement

Routing

Compaction

Circuit Partitioning
 Course contents:
 Kernighan-Lin partitioning algorithm

Circuit Partitioning
 Objective: Partition a circuit into parts such that the size of every component is within a prescribed range and the # of connections among the components is minimized.
 More constraints are possible for some applications.
 Cutset? Cut size? Size of a component?

Problem Definition: Partitioning


 k-way partitioning: Given a graph $G(V, E)$, where each vertex $v \in V$ has a size $s(v)$ and each edge $e \in E$ has a weight $w(e)$, the problem is to divide the set $V$ into $k$ disjoint subsets $V_1, V_2, \ldots, V_k$, such that an objective function is optimized, subject to certain constraints.
 Bounded size constraint: The size of the $i$-th subset is bounded by $B_i$ (i.e., $\sum_{v \in V_i} s(v) \le B_i$).
 Is the partition balanced?
 Min-cut cost between two subsets: Minimize $\sum_{e = (u, v),\ p(u) \ne p(v)} w(e)$, where $p(u)$ is the partition # of node $u$.
 The 2-way, balanced partitioning problem is NP-complete, even in its simple form with identical vertex sizes and unit edge weights.

Kernighan-Lin Algorithm
 Kernighan and Lin, “An efficient heuristic procedure for
partitioning graphs,” The Bell System Technical Journal, vol.
49, no. 2, Feb. 1970.
 An iterative, 2-way, balanced partitioning (bi-sectioning)
heuristic.
 Repeat while the cut size keeps decreasing:
 Vertex pairs which give the largest decrease or the smallest increase in cut size are exchanged.
 These vertices are then locked (and thus are prohibited from participating in any further exchanges).
 This process continues until all the vertices are locked.
 Find the sequence of exchanges with the largest partial sum of gains and perform only those swaps.
 Unlock all vertices.

K-L Algorithm: A Simple Example


 Each edge has a unit weight.

 Questions: How to compute the cost reduction? Which pairs should be swapped?
 Consider the change of internal & external connections.

Properties
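 Two sets $A$ and $B$ such that $|A| = n = |B|$ and $A \cap B = \emptyset$.
 External cost of $a \in A$: $E_a = \sum_{v \in B} c_{av}$.
 Internal cost of $a \in A$: $I_a = \sum_{v \in A} c_{av}$.
 D-value of a vertex $a$: $D_a = E_a - I_a$ (cost reduction for moving $a$).
 Cost reduction (gain) for swapping $a$ and $b$: $g_{ab} = D_a + D_b - 2c_{ab}$.
 If $a \in A$ and $b \in B$ are interchanged, then the new D-values, $D'$, are given by
$D'_x = D_x + 2c_{xa} - 2c_{xb}, \ \forall x \in A - \{a\}$
$D'_y = D_y + 2c_{yb} - 2c_{ya}, \ \forall y \in B - \{b\}$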
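
The D-value bookkeeping and the "largest partial sum" rule can be sketched as one pass of the heuristic. This is a minimal illustration, not the slides' own pseudocode: the function names `cut_size` and `kl_pass`, the cost-matrix representation, and the 6-vertex example graph are all assumptions, and the cost matrix is assumed symmetric with a zero diagonal.

```python
def cut_size(cost, A, B):
    """Total weight of the edges crossing between A and B."""
    return sum(cost[a][b] for a in A for b in B)


def kl_pass(cost, A, B):
    """One Kernighan-Lin pass; returns the new (A, B) and the realized gain."""
    A, B = list(A), list(B)
    # D-value of each vertex: external cost minus internal cost.
    D = {}
    for a in A:
        D[a] = sum(cost[a][v] for v in B) - sum(cost[a][v] for v in A)
    for b in B:
        D[b] = sum(cost[b][v] for v in A) - sum(cost[b][v] for v in B)
    free_a, free_b = sorted(A), sorted(B)  # unlocked vertices
    swaps, gains = [], []
    while free_a and free_b:
        # Tentatively exchange the unlocked pair with the largest gain.
        g, a, b = max(
            ((D[p] + D[q] - 2 * cost[p][q], p, q) for p in free_a for q in free_b),
            key=lambda t: t[0],
        )
        swaps.append((a, b))
        gains.append(g)
        free_a.remove(a)  # lock both vertices
        free_b.remove(b)
        # Update the D-values of the remaining unlocked vertices.
        for x in free_a:
            D[x] += 2 * cost[x][a] - 2 * cost[x][b]
        for y in free_b:
            D[y] += 2 * cost[y][b] - 2 * cost[y][a]
    # Keep only the prefix of swaps with the largest partial sum of gains.
    best_k, best_sum, running = 0, 0, 0
    for k, g in enumerate(gains, start=1):
        running += g
        if running > best_sum:
            best_k, best_sum = k, running
    for a, b in swaps[:best_k]:
        A[A.index(a)] = b
        B[B.index(b)] = a
    return A, B, best_sum


# Hypothetical example: two triangles {0,1,2} and {3,4,5} joined by edge (2,3).
cost = [[0] * 6 for _ in range(6)]
for u, v in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    cost[u][v] = cost[v][u] = 1
before = cut_size(cost, [0, 1, 3], [2, 4, 5])
A, B, gain = kl_pass(cost, [0, 1, 3], [2, 4, 5])
after = cut_size(cost, A, B)
```

A full run would repeat `kl_pass` until the returned gain is no longer positive.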


A Weighted Example

 Iteration 1

A Weighted Example (cont’d)
 Iteration 1:

 gxy = Dx + Dy - 2cxy.

 Swap b and f.

A Weighted Example (cont’d)

 D’x = Dx + 2 cxp - 2 cxq, ∀x ∈ A – {p} (swap p and q, p ∈ A, q ∈ B)

 gxy = D’x + D’y - 2cxy.

 Swap c and e.
A Weighted Example (cont’d)

 D’’x = D’x + 2 cxp - 2 cxq, ∀x ∈ A – {p}

 gxy = D’’x + D’’y - 2cxy.

 Note that this step is redundant

 Summary:

 Largest partial sum (k = 1) ⇒ Swap b and f.


A Weighted Example (cont’d)

 Iteration 2: Repeat what we did at Iteration 1 (Initial cost = 22 - 4 = 18).

 Summary:

 Largest partial sum = 0 (k = 3) ⇒ Stop!

Kernighan-Lin Algorithm


Time Complexity
Line 4: Initial computation of D: $O(n^2)$
Line 5: The for-loop: $O(n)$
The body of the loop: $O(n^2)$.
 Lines 6--7: Step $i$ takes $(n - i + 1)^2$ time.
Lines 4--11: Each pass of the repeat loop: $O(n^3)$.
Suppose the repeat loop terminates after $r$ passes.
The total running time: $O(rn^3)$.
 Polynomial-time algorithm?

Extensions of K-L Algorithm
 Unequal sized subsets (assume n1 < n2)
1. Partition: |A| = n1 and |B| = n2.
2. Add n2 - n1 dummy vertices to set A. Dummy vertices have no
connections to the original graph.
3. Apply the Kernighan-Lin algorithm.
4. Remove all dummy vertices.
 Unequal sized “vertices”
1. Assume that the smallest “vertex” has unit size.
2. Replace each vertex of size s with s vertices which are fully
connected with edges of infinite weight.
3. Apply the Kernighan-Lin algorithm.
 k-way partition
1. Partition the graph into k equal-sized sets.
2. Apply the Kernighan-Lin algorithm for each pair of subsets.
3. Time complexity? Can be reduced by recursive bi-partition.


Outline
Partitioning

Floorplanning

Placement

Routing

Compaction

Floorplanning
 Course contents
 Floorplan basics
 Normalized Polish expression for slicing floorplans
 B*-trees for non-slicing floorplans
 Reading
 Chapter 10

Pentium 4
PowerPC 604

Floorplanning
 Partitioning leads to
 Blocks with well-defined areas and shapes (rigid/hard
blocks).
 Blocks with approximate areas and no particular shapes
(flexible/soft blocks).
 A netlist specifying connections between the blocks.
 Objectives
 Find locations for all blocks.
 Consider shapes of soft blocks and pin locations of all the blocks.

Early Layout Decision Example


Early Layout Decision Methodology


 An integrated circuit is essentially a two-dimensional medium; taking this aspect into account in early stages of the design helps in creating designs of good quality.
 Floorplanning gives early feedback: thinking of layout at early stages may suggest valuable architectural modifications; floorplanning also aids in estimating delay due to wiring.
 Floorplanning fits very well in a top-down design strategy, the step-wise refinement strategy also advocated in software design.
 Floorplanning assumes, however, flexibility in layout design: the existence of cells that can adapt their shapes and terminal locations to the environment.

Floorplanning Problem
 Inputs to the floorplanning problem:
 A set of blocks, hard or soft.
 Pin locations of hard blocks.
 A netlist.
 Objectives: minimize area, reduce wirelength for
(critical) nets, maximize routability (minimize
congestion), determine shapes of soft blocks, etc.


Floorplan Design

Floorplanning Concepts
 Leaf cell (block/module): a cell at the lowest level of the hierarchy; it does not contain any other cell.
 Composite cell (block/module): a cell that is composed of either leaf cells or composite cells. The entire IC is the highest-level composite cell.
[Figure: a floorplan with a leaf cell and a composite cell labeled.]

Slicing Floorplan + Slicing Tree


 A composite cell’s subcells are obtained by a horizontal or vertical bisection of the composite cell.
 Slicing floorplans can be represented by a slicing tree.
 In a slicing tree, all cells (except for the top-level cell) have a parent, and all composite cells have children.
 A slicing floorplan is also called a floorplan of order 2.
 H: horizontal cut; V: vertical cut (different from the definitions in the textbook!)
[Figure: a slicing floorplan and its slicing tree.]

Skewed Slicing Tree
 Rectangular dissection: Subdivision of a given rectangle by a
finite # of horizontal and vertical line segments into a finite # of
non-overlapping rectangles.
 Slicing structure: a rectangular dissection that can be obtained
by repetitively subdividing rectangles horizontally or vertically.
 Slicing tree: A binary tree, where each internal node represents
a vertical cut line or horizontal cut line, and each leaf a basic
rectangle.
 Skewed slicing tree: One in which no node and its right child
are the same.


Slicing Floorplan Design by Simulated Annealing
Related work
 Wong & Liu, “A new algorithm for floorplan
design,” DAC-86.
Considers slicing floorplans.
 Wong & Liu, “Floorplan design for rectangular
and L-shaped modules,” ICCAD'87.
Also considers L-shaped modules.
 Wong, Leong, Liu, Simulated Annealing for
VLSI Design, pp. 31--71, Kluwer Academic
Publishers, 1988.

Simulated Annealing
 Kirkpatrick, Gelatt, and Vecchi, “Optimization by simulated
annealing,” Science, May 1983.
 Greene and Supowit, “Simulated annealing without rejected
moves,” ICCD-84.


Simulated Annealing Basics


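 Non-zero probability for “up-hill” moves.
 Probability depends on
1. magnitude of the “up-hill” movement
2. total search time
 $\mathrm{Prob}(S \to S') = 1$ if $\Delta C \le 0$ (“down-hill” move); $\mathrm{Prob}(S \to S') = e^{-\Delta C / T}$ if $\Delta C > 0$ (“up-hill” move)
 $\Delta C = \mathrm{cost}(S') - \mathrm{cost}(S)$
 T: Control parameter (temperature)
 Annealing schedule: $T = T_0, T_1, T_2, \ldots$, where $T_i = r^i T_0$ with $r < 1$.
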
Generic Simulated Annealing Algorithm
begin
  Get an initial solution S;
  Get an initial temperature T > 0;
  while not yet “frozen” do
    for 1 ≤ i ≤ P do
      Pick a random neighbor S' of S;
      Δ ← cost(S') - cost(S);
      /* downhill move */
      if Δ ≤ 0 then S ← S'
      /* uphill move */
      if Δ > 0 then S ← S' with probability e^(-Δ/T);
    T ← rT; /* reduce temperature */
  return S
end
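
The generic loop can be sketched in Python as follows. This is only an illustration under assumptions: the function name, the default parameter values, the "frozen" test (temperature below `t_frozen`), and the toy integer cost function are not from the slides. Unlike the pseudocode, the sketch also remembers the best solution seen.

```python
import math
import random


def simulated_annealing(initial, neighbor, cost, t0=10.0, r=0.85,
                        moves_per_temp=100, t_frozen=1e-3, seed=0):
    """Generic annealing loop; `neighbor(s, rng)` returns a random neighbor of s."""
    rng = random.Random(seed)
    s = best = initial
    t = t0
    while t > t_frozen:  # assumed "frozen" criterion
        for _ in range(moves_per_temp):
            s_new = neighbor(s, rng)
            delta = cost(s_new) - cost(s)
            # Downhill moves are always taken; uphill moves with prob. e^(-delta/T).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                s = s_new
                if cost(s) < cost(best):
                    best = s
        t *= r  # reduce temperature
    return best


# Toy problem (assumed): minimize (x - 7)^2 over the integers, starting at 50.
result = simulated_annealing(
    50,
    neighbor=lambda x, rng: x + rng.choice((-1, 1)),
    cost=lambda x: (x - 7) ** 2,
)
```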


Basic Ingredients for Simulated Annealing
 Analogy:

 Basic Ingredients for Simulated Annealing:


 Solution space
 Neighborhood structure
 Cost function
 Annealing schedule

Solution Representation of Slicing Floorplan
 An expression $E = e_1 e_2 \ldots e_{2n-1}$, where $e_i \in \{1, 2, \ldots, n, H, V\}$, $1 \le i \le 2n-1$, is a Polish expression of length $2n-1$ iff
1. every operand $j$, $1 \le j \le n$, appears exactly once in $E$;
2. (the balloting property) for every subexpression $E_i = e_1 \ldots e_i$, $1 \le i \le 2n-1$, # operands > # operators.
 Polish expression ↔ Postorder traversal.
 ijH: rectangle i on bottom of j; ijV: rectangle i on the left of j.


Redundant Representations

 Question: How to eliminate ambiguous representation?

Normalized Polish Expression
 A Polish expression E = e1 e2 … e2n-1 is called
normalized iff E has no consecutive operators of
the same type (H or V), i.e. skewed.
 Given a normalized Polish expression, we can
construct a unique rectangular slicing structure.
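
Both definitions are easy to check mechanically. The sketch below is an illustration only: the function names and the token encoding (integers for operands, the strings "H" and "V" for operators) are assumptions.

```python
OPERATORS = ("H", "V")


def is_polish_expression(expr, n):
    """Check length 2n-1, each operand 1..n exactly once, and the balloting property."""
    operands = [e for e in expr if e not in OPERATORS]
    if len(expr) != 2 * n - 1 or sorted(operands) != list(range(1, n + 1)):
        return False
    n_operands = n_operators = 0
    for e in expr:
        if e in OPERATORS:
            n_operators += 1
        else:
            n_operands += 1
        if n_operands <= n_operators:  # balloting property violated at this prefix
            return False
    return True


def is_normalized(expr):
    """True iff no two consecutive tokens are the same operator."""
    return all(not (a in OPERATORS and a == b) for a, b in zip(expr, expr[1:]))
```

For example, `[1, 2, 3, "V", "V"]` is a Polish expression but not a normalized one, while `[1, 2, "V", 3, "V"]` is both.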


Neighborhood Structure
 Chain: HVHVH … or VHVHV …

 Adjacent: 1 and 6 are adjacent operands; 2 and 7 are adjacent operands; 5 and V are adjacent operand and operator.
 3 types of moves:
 M1 (Operand Swap): Swap two adjacent operands.
 M2 (Chain Invert): Complement some chain ($\overline{V} = H$, $\overline{H} = V$).
 M3 (Operator/Operand Swap): Swap two adjacent operand and operator.

Effects of Perturbation
 Question: Does the balloting property hold during the moves?
 M1 and M2 moves are OK.
 Check the M3 moves! Reject “illegal” M3 moves.
 Check M3 moves: Assume that the M3 move swaps the operand $e_i$ with the operator $e_{i+1}$, $1 \le i \le k-1$. Then, the swap will not violate the balloting property iff $2N_{i+1} < i$.
 $N_k$: # of operators in the Polish expression $E = e_1 e_2 \ldots e_k$, $1 \le k \le 2n-1$
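
The M3 test can be compared against a direct check of the balloting property. This is a hedged sketch: the function names and the token encoding (integers for operands, "H"/"V" for operators) are assumptions, and positions are 1-based to match the slide's indices.

```python
OPERATORS = ("H", "V")


def satisfies_balloting(expr):
    """Every prefix must contain more operands than operators."""
    operands = operators = 0
    for e in expr:
        if e in OPERATORS:
            operators += 1
        else:
            operands += 1
        if operands <= operators:
            return False
    return True


def m3_keeps_balloting(expr, i):
    """Swap of operand e_i with operator e_{i+1} (1-based i) is safe iff 2*N_{i+1} < i."""
    n_ops = sum(1 for e in expr[:i + 1] if e in OPERATORS)  # N_{i+1}
    return 2 * n_ops < i


def swapped(expr, i):
    """Return a copy of expr with e_i and e_{i+1} exchanged (1-based i)."""
    out = list(expr)
    out[i - 1], out[i] = out[i], out[i - 1]
    return out
```

For `[1, 2, "V", 3, "H"]`, swapping the operand 2 with the following V gives `2 * 1 < 2`, which is false, so the move is rejected; the result `1 V 2 3 H` would indeed violate the balloting property.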


Cost Function
 $\Phi = A + \lambda W$.
 A: area of the smallest rectangle
 W: overall wiring length
 $\lambda$: user-specified parameter
 $W = \sum_{ij} c_{ij} d_{ij}$.
 $c_{ij}$: # of connections between blocks i and j.
 $d_{ij}$: center-to-center distance between basic rectangles i and j.

Area Computation for Hard Blocks
 Allow rotation

 Wiring cost?
 Center-to-center interconnection length
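
The area term can be evaluated directly from a Polish expression with a stack. The sketch below is an assumption-laden illustration: it follows the convention that ijH stacks i below j and ijV puts i to the left of j, allows each hard block to be rotated by 90 degrees, and combines candidate shapes by brute force with pruning of dominated shapes (not a linear-time merge). All names and block dimensions are hypothetical.

```python
def prune(shapes):
    """Keep only non-dominated (width, height) pairs."""
    kept, best_h = [], float("inf")
    for w, h in sorted(set(shapes)):  # increasing width, then height
        if h < best_h:
            kept.append((w, h))
            best_h = h
    return kept


def combine(left, right, op):
    """All shapes of a composite cell built from two children."""
    out = []
    for w1, h1 in left:
        for w2, h2 in right:
            if op == "H":  # stacked: heights add
                out.append((max(w1, w2), h1 + h2))
            else:          # side by side: widths add
                out.append((w1 + w2, max(h1, h2)))
    return prune(out)


def min_area(expr, dims):
    """Smallest bounding-rectangle area of the slicing floorplan given by expr."""
    stack = []
    for e in expr:
        if e in ("H", "V"):
            right = stack.pop()
            left = stack.pop()
            stack.append(combine(left, right, e))
        else:
            w, h = dims[e]
            stack.append(prune([(w, h), (h, w)]))  # allow rotation
    return min(w * h for w, h in stack[0])


# Hypothetical blocks: 1 is 2x4, 2 is 4x2, 3 is 4x4.
dims = {1: (2, 4), 2: (4, 2), 3: (4, 4)}
area = min_area([1, 2, "V", 3, "H"], dims)
```

Here blocks 1 and 2 are both rotated to 2x4, placed side by side as a 4x4 cell, and block 3 is stacked on top, so no area is wasted.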

Incremental Computation of Cost Function
 Each move leads to only a minor modification of
the Polish expression.
 At most two paths of the slicing tree need to be
updated for each move.

Incremental Computation of Cost Function (cont'd)


Annealing Schedule
 Initial solution: 12V3V … nV.

 $T_i = r^i T_0$, i = 1, 2, 3, …; r = 0.85.
 At each temperature, try kn moves (k = 5-10).
 Terminate the annealing process if
 # of accepted moves < 5%,
 temperature is low enough, or
 run out of time.

Wong-Liu Algorithm
Input: (P, ε, r, k)
begin
  E ← 12V3V4V … nV; /* initial solution */
  Best ← E; T0 ← -Δavg / ln(P); M ← MT ← uphill ← 0; N = kn;
  repeat
    MT ← uphill ← reject ← 0;
    repeat
      SelectMove(M);
      Case M of
        M1: Select two adjacent operands ei and ej; NE ← Swap(E, ei, ej);
        M2: Select a nonzero length chain C; NE ← Complement(E, C);
        M3: done ← FALSE;
          while not (done) do
            /* either an operand followed by an operator */
            Select two adjacent operand ei and operator ei+1;
            if (ei-1 ≠ ei+1) and (2 Ni+1 < i) then done ← TRUE;
            /* or an operator followed by an operand */
            Select two adjacent operator ei and operand ei+1;
            if (ei ≠ ei+2) then done ← TRUE;
          NE ← Swap(E, ei, ei+1);
      MT ← MT+1; Δcost ← cost(NE) - cost(E);
      if (Δcost ≤ 0) or (Random < e^(-Δcost/T))
      then
        if (Δcost > 0) then uphill ← uphill + 1;
        E ← NE;
        if cost(E) < cost(Best) then Best ← E;
      else reject ← reject + 1;
    until (uphill > N) or (MT > 2N);
    T ← rT; /* reduce temperature */
  until (reject/MT > 0.95) or (T < ε) or OutOfTime;
end


Shape Curve
 Flexible cells imply that cells can have different aspect
ratios.
 The relation between the width x and the height y is: xy = A, or y = A/x. The shape function is a hyperbola.
 Very thin cells are not interesting and often not feasible to design. The shape function is a combination of a hyperbola and two straight lines.
 Aspect ratio: r ≤ y/x ≤ s.
[Figure: shape curves in the x-y plane, with the legal shapes bounded by the lines y = sx and y = rx.]
Shape Curve (cont’d)
 Leaf cells are built from discrete transistors: it is
not realistic to assume that the shape function
follows the hyperbola continuously.
 In an extreme case, a cell is rigid: it can only be
rotated and mirrored during floorplanning or
placement.

[Figure: the shape function of a 2 × 4 inset cell.]

Shape Curve (cont’d)


 In general, a piecewise linear function can be
used to approximate any shape function.
 The points where the function changes its
direction, are called the corner (break) points of
the piecewise linear function.

Addition for Vertical Abutment
 Composition by vertical abutment ⇒ the addition of shape functions.
[Figure: vertical abutment of R1 and R2.]


Deriving Shapes of Children


 A choice for the minimal shape of a composite cell fixes the shapes of its children cells.

Sizing Algorithm for Slicing Floorplans
 The shape functions of all leaf cells are given as
piecewise linear functions.
 Traverse the slicing tree in order to compute the
shape functions of all composite cells (bottom-up
composition).
 Choose the desired shape of the top-level cell; as
the shape function is piecewise linear, only the
break points of the function need to be evaluated,
when looking for the minimal area.
 Propagate the consequences of the choice down
to the leaf cells (top-down propagation).
 The sizing algorithm runs in polynomial time for
slicing floorplans
 NP-complete for non-slicing floorplans


Feasible Implementations
 Shape curves correspond to different kinds of constraints
where the shaded areas are feasible regions.

Wheel or Spiral Floorplan
 This floorplan is not slicing!
 The wheel is the smallest non-slicing floorplan.
 Limiting floorplans to those
that have the slicing
property is reasonable: it
certainly facilitates
floorplanning algorithms.
 Taking the shape of a
wheel floorplan and its
mirror image as the basis
of operators leads to
hierarchical descriptions of
order 5.


Order-5 Floorplan Examples

General Floorplan Representation: Polar Graphs
horizontal polar graph
 vertex: channel segment
 edge: cell/block/module

vertical polar graph


B*-Tree: Compacted Floorplan Representation
 Chang et al., “B*-tree: A new representation for non-slicing
floorplans,” DAC 2000.
 Compact modules to left and bottom
 Construct an ordered binary tree (B*-tree)
 Left child: the lowest, adjacent block on the right (xj = xi+wi)
 Right child: the first block above, with the same x-coordinate (xj =
xi)

[Figure: a non-slicing floorplan of blocks 1 to 6; the same floorplan compacted to the left and down; the corresponding B*-tree with nodes n1 to n6.]
B*-tree Packing
 x-coordinates can be determined by the tree structure
 Left child: the lowest, adjacent block on the right (xj = xi+wi)
 Right child: the first block above, with the same x-coordinate
(xj = xi)
 Y-coordinates?
 Horizontal contour: Use a doubly linked list to record the
current maximum y-coordinate for each x-range
 Reduce the complexity of computing a y-coordinate to
amortized O(1) time
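
The packing rules can be sketched as a depth-first traversal. This is an illustration under assumptions: the `Node` class, the `pack` function, and the block dimensions are hypothetical, and the contour is kept as a plain list of `(x_start, x_end, top)` segments that is scanned linearly, which is simpler but slower than the doubly linked list with amortized O(1) updates mentioned above.

```python
class Node:
    """B*-tree node: left child sits to the right, right child sits above."""

    def __init__(self, name, w, h, left=None, right=None):
        self.name, self.w, self.h = name, w, h
        self.left, self.right = left, right


def pack(root):
    """Return {name: (x, y)} for every block in the tree."""
    contour = []  # disjoint (x_start, x_end, top) segments
    pos = {}

    def place(node, x):
        x_end = x + node.w
        # y is the highest contour point under the block's x-range.
        y = max((top for a, b, top in contour if a < x_end and b > x), default=0)
        pos[node.name] = (x, y)
        # Cut the covered part out of the old contour and add the new top edge.
        updated = []
        for a, b, top in contour:
            if b <= x or a >= x_end:
                updated.append((a, b, top))
            else:
                if a < x:
                    updated.append((a, x, top))
                if b > x_end:
                    updated.append((x_end, b, top))
        updated.append((x, x_end, y + node.h))
        contour[:] = sorted(updated)
        if node.left:   # lowest adjacent block on the right: x_j = x_i + w_i
            place(node.left, x_end)
        if node.right:  # first block above with the same x: x_j = x_i
            place(node.right, x)

    place(root, 0)
    return pos


# Hypothetical six-block tree: n1(left n2, right n3), n3(left n4, right n6), n4(left n5).
n5 = Node(5, 6, 5)
n4 = Node(4, 3, 7, left=n5)
n6 = Node(6, 12, 2)
n3 = Node(3, 3, 6, left=n4, right=n6)
n2 = Node(2, 6, 8)
n1 = Node(1, 9, 6, left=n2, right=n3)
positions = pack(n1)
```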

[Figure: B*-tree packing example with x2 = x1 + w1, x3 = x1, x4 = x3 + w3, x5 = x4 + w4, and x6 = x3.]

Contour Data Structure


[Figure: the contour C after each of blocks 1 to 6 is packed.]
After block 1: C = <(0,0), (0,6), (9,6), (9,0), (∞,0)>
After block 2: C = <(0,0), (0,6), (9,6), (9,8), (15,8), (15,0), (∞,0)>
After block 3: C = <(0,0), (0,12), (3,12), (3,6), (9,6), (9,8), (15,8), (15,0), (∞,0)>
After block 4: C = <(0,0), (0,12), (3,12), (3,13), (6,13), (6,6), (9,6), (9,8), (15,8), (15,0), (∞,0)>
After block 5: C = <(0,0), (0,12), (3,12), (3,13), (6,13), (12,13), (12,8), (15,8), (15,0), (∞,0)>
After block 6: C = <(0,0), (0,15), (12,15), (12,13), (12,8), (15,8), (15,0), (∞,0)>

B*-tree Perturbation
 Op1: rotate a macro
 Op2: move a node to another place
 Op3: swap two nodes
[Figure: placements and B*-trees before and after Op1, Op2, and Op3.]

Simulated Annealing Using B*-tree


 The cost function is
based on problem
requirements

Strengths of B*-tree
 Binary tree based, efficient and easy
 Flexible to deal with various placement constraints by
augmenting the B*-tree data structure (e.g., preplaced,
symmetry, alignment, bus position) and rectilinear modules
 Transformation between a tree and its placement takes
only linear time
 Operate on only one B*-tree (vs. two O-trees)
 Can evaluate area cost incrementally
 Smaller solution space: only $O(n! \, 4^n / n^{1.5})$ combinations
 Directly corresponds to hierarchical and multilevel
frameworks for large-scale floorplan designs
 Can be extended to 3D floorplanning & related applications


Weaknesses of B*-tree
 Representation may change after packing
 Only a partially topological representation; less flexible than a fully topological representation
 B*-tree can represent only compacted placement
[Figure: a B*-tree with nodes n1 to n4, a compacted placement of blocks 1 to 4, and a non-compacted placement labeled “B*-tree??”.]

Outline
Partitioning

Floorplanning

Placement

Routing

Compaction


Placement
 Course contents:
 Placement metrics
 Constructive placement: cluster growth, min cut
 Iterative placement: force-directed method, simulated
annealing
 Reading
 Chapter 11

Placement
 Placement is the problem of automatically assigning
correct positions on the chip to predesigned cells, such that
some cost function is optimized.
 Inputs: A set of fixed cells/modules, a netlist.
 Goal: Find the best position for each cell/module on the
chip according to appropriate cost functions.
 Considerations: routability/channel density, wirelength,
cut size, performance, thermal issues, I/O pads.


Placement Objectives and Constraints


 What does a placement algorithm try to optimize?
 total area
 total wire length
 number of horizontal/vertical wire segments crossing a line
 Constraints:
 placement should be routable (no cell overlaps; no density
overflow).
 timing constraints are met (some wires should always be
shorter than a given length).

VLSI Placement: Building Blocks
 Different design styles create different placement
problems.
 E.g., building-block, standard-cell, gate-array placement
 Building block: The cells to be placed have arbitrary
shapes.

building block example


VLSI Placement: Standard Cells


 Standard cells are designed in such a way that power and
clock connections run horizontally through the cell and
other I/O leaves the cell from the top or bottom sides.
 The cells are placed in rows.
 Sometimes feedthrough cells are added to ease wiring.

feedthrough

Consequences of Fabrication Method
 Full-custom fabrication (building block):
 Free selection of aspect ratio (quotient of height and width).
 Height of wiring channels can be adapted to necessity.
 Semi-custom fabrication (gate array, standard cell):
 Placement has to deal with fixed carrier dimensions.
 Placement should be able to deal with fixed channel capacities.

gate array


Relation with Routing


Ideally, placement and routing should be
performed simultaneously as they depend
on each other’s results. This is, however,
too complicated.
 P&R: placement and routing
In practice placement is done prior to
routing. The placement algorithm
estimates the wire length of a net using
some metric.

Wirelength Estimation
 Semi-perimeter method: Half the perimeter of the bounding
rectangle that encloses all the pins of the net to be connected.
Most widely used approximation!
 Steiner-tree approximation: Computationally expensive.
 Minimum spanning tree: Good approximation to Steiner trees.
 Squared Euclidean distance: Squares of all pairwise terminal
distances in a net using a quadratic cost function

 Complete graph: Since #edges in a complete graph is $\frac{n(n-1)}{2}$, wirelength $= \frac{2}{n} \sum_{(i, j) \in \mathrm{net}} \mathrm{dist}(i, j)$.
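
Two of these estimates are short enough to sketch directly. The function names are assumptions, pins are taken to be `(x, y)` tuples, and the pairwise distance in the complete-graph estimate is assumed to be the Manhattan distance, which the slide does not specify.

```python
from itertools import combinations


def hpwl(pins):
    """Semi-perimeter: half the perimeter of the pins' bounding rectangle."""
    xs, ys = zip(*pins)
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def complete_graph_estimate(pins):
    """Sum of all pairwise distances, scaled by 2/n."""
    n = len(pins)
    total = sum(abs(x1 - x2) + abs(y1 - y2)
                for (x1, y1), (x2, y2) in combinations(pins, 2))
    return 2.0 * total / n


pins = [(0, 0), (4, 0), (4, 3)]  # hypothetical 3-pin net
```

For this net the semi-perimeter is 4 + 3 = 7, while the pairwise distances 4, 7, and 3 give a complete-graph estimate of (2/3) × 14.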


Wirelength Estimation (cont'd)

Placement Algorithms
 The placement problem is NP-complete
 Popular placement algorithms:
 Constructive algorithms: once the position of a cell is fixed,
it is not modified anymore.
 Cluster growth, min cut, etc.
 Iterative algorithms: intermediate placements are modified
in an attempt to improve the cost function.
 Force-directed method, etc
 Nondeterministic approaches: simulated annealing, genetic
algorithm, etc.
 Most approaches combine multiple elements:
 Constructive algorithms are used to obtain an initial
placement.
 The initial placement is followed by an iterative improvement
phase.
 The results can further be improved by simulated annealing.


Bottom-Up Placement: Clustering


Starts with a single cell and finds more
cells that share nets with it.

Placement by Cluster Growth
 Greedy method: Selects unplaced components and places
them in available slots.
 SELECT: Choose the unplaced component that is most
strongly connected to all of the placed components (or
most strongly connected to any single placed
component).
 PLACE: Place the selected component at a slot such that
a certain “cost” of the partial placement is minimized.


Cluster Growth Example


 # of other terminals connected: ca=3, cb=1, cc=1, cd=1, ce=4, cf=3, and cg=3 ⇒ e has the most connectivity.
 Place e in the center, slot 4. a, b, g are connected to e, and
 Place a next to e (say, slot 3). Continue until all cells are
placed.
 Further improve the placement by swapping the gates.

Top-down Placement: Min Cut
 Starts with the whole circuit and ends with small
circuits.
 Recursive bipartitioning of a circuit (e.g., K&L)
leads to a min-cut placement.


Min-Cut Placement
 Breuer, “A class of min-cut placement algorithms,” DAC, 1977.
 Quadrature: suitable for circuits with high density in the
center.
 Bisection: good for standard-cell placement.
 Slice/Bisection: good for cells with high interconnection on
the periphery.

Algorithm for Min-Cut Placement
Algorithm: Min_Cut_Placement(N, n, C)
/* N: the layout surface */
/* n: # of cells to be placed */
/* n0: # of cells in a slot */
/* C: the connectivity matrix */

begin
  if (n ≤ n0) then PlaceCells(N, n, C)
  else
    (N1, N2) ← CutSurface(N);
    (n1, C1), (n2, C2) ← Partition(n, C);
    Call Min_Cut_Placement(N1, n1, C1);
    Call Min_Cut_Placement(N2, n2, C2);
end


Quadrature Placement Example


 Apply the K-L heuristic to partition + Quadrature
Placement: Cost C1 = 4, C2L= C2R = 2, etc.

Min-Cut Placement with Terminal Propagation
 Dunlop & Kernighan, “A procedure for placement of
standard-cell VLSI circuits,” IEEE TCAD, Jan. 1985.
 Drawback of the original min-cut placement: Does not
consider the positions of terminal pins that enter a region.
 What happens if we swap {1, 3, 6, 9} and {2, 4, 5, 7}
in the previous example?


Terminal Propagation
 We should use the fact that s is in L1!

 When not to use p to bias partitioning? Net s has cells in many groups?

Terminal Propagation Example
 Partitioning must be done breadth-first, not
depth-first.


General Procedure for Iterative Improvement
Algorithm: Iterative_Improvement()
begin
  s ← initial_configuration();
  c ← cost(s);
  while (not stop()) do
    s' ← perturb(s);
    c' ← cost(s');
    if (accept(c, c'))
      then s ← s'; c ← c';
end

Placement by the Force-Directed Method
 Hanan & Kurtzberg, “Placement techniques,” in Design
Automation of Digital Systems, Breuer, Ed, 1972.
 Quinn, Jr. & Breuer, “A force directed component placement
procedure for printed circuit boards,” IEEE Trans. Circuits and
Systems, June 1979.
 Reduce the placement problem to solving a set of simultaneous
linear equations to determine equilibrium locations for cells.
 Analogy to Hooke's law: F = kd, F: force, k: spring constant, d:
distance.
 Goal: Map cells to the layout surface.


Finding the Zero-Force Target Location


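 Cell i connects to several cells j at distances $d_{ij}$ by wires of weights $w_{ij}$. Total force: $F_i = \sum_j w_{ij} d_{ij}$
 The zero-force target location $(\hat{x}_i, \hat{y}_i)$ can be determined by equating the x- and y-components of the forces to zero:
$\sum_j w_{ij} (x_j - \hat{x}_i) = 0 \Rightarrow \hat{x}_i = \frac{\sum_j w_{ij} x_j}{\sum_j w_{ij}}$, $\sum_j w_{ij} (y_j - \hat{y}_i) = 0 \Rightarrow \hat{y}_i = \frac{\sum_j w_{ij} y_j}{\sum_j w_{ij}}$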

 In the example, and = 1.50.
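
Setting the net force on a cell to zero makes its target location the weighted average of the positions of the cells it connects to. A minimal sketch, with an assumed function name and made-up weights and coordinates (these are not the values of the slide's example):

```python
def zero_force_location(neighbors):
    """neighbors: list of (weight, x, y) for the cells connected to cell i."""
    total = sum(w for w, _, _ in neighbors)
    x_hat = sum(w * x for w, x, _ in neighbors) / total
    y_hat = sum(w * y for w, _, y in neighbors) / total
    return x_hat, y_hat


# Hypothetical cell with a weight-1 wire to (0, 0) and a weight-3 wire to (4, 4).
target = zero_force_location([(1, 0.0, 0.0), (3, 4.0, 4.0)])
```

The heavier wire pulls the target three quarters of the way toward (4, 4).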

Force-Directed Placement
Can be constructive or iterative:
 Start with an initial placement.
 Select a “most profitable” cell p (e.g.,
maximum F, critical cells) and place it in its
zero-force location.
 “Fix” placement if the zero-location has been
occupied by another cell q.
Popular options to fix:
 Ripple move: place p in the occupied location,
compute a new zero-force location for q, …
 Chain move: place p in the occupied location, move q
to an adjacent location, …
 Move p to a free location close to q.


Force-Directed Placement

Placement by Simulated Annealing
 Sechen and Sangiovanni-Vincentelli, “The TimberWolf
placement and routing package,” IEEE J. Solid-State
Circuits, Feb. 1985; “TimberWolf 3.2: A new standard cell
placement and global routing package,” DAC-86.
 TimberWolf: Stage 1
 Modules are moved between different rows as well as
within the same row.
 Module overlaps are allowed.
 When the temperature drops below a certain value, stage 2 begins.
 TimberWolf: Stage 2
 Remove overlaps.
 Annealing process continues, but only interchanges
adjacent modules within the same row.


Solution Space & Neighborhood Structure
 Solution Space: All possible arrangements of
the modules into rows, possibly with overlaps.
 Neighborhood Structure: 3 types of moves
 M1: Displace a module to a new location.
 M2: Interchange two modules.
 M3: Change the orientation of a module.

Neighborhood Structure
 TimberWolf first tries to select a move between M1 and M2:
Prob(M1) = 0.8, Prob(M2) = 0.2.
 If a move of type M1 is chosen and it is rejected, then a move of
type M3 for the same module will be chosen with probability 0.1.
 Restrictions: (1) To which rows can a module be displaced? (2) Which pairs of modules can be interchanged?
 Key: Range Limiter
 At the beginning, (WT, HT) is big enough to contain the whole chip.
 Window size shrinks as temperature decreases. Height & width ∝ log(T).
 Stage 2 begins when window size is so small that no inter-row module interchanges are possible.


Cost Function
 Cost function: $C = C_1 + C_2 + C_3$.
 $C_1$: total estimated wirelength.
 $C_1 = \sum_{i \in \mathrm{Nets}} (\alpha_i w_i + \beta_i h_i)$
 $\alpha_i$, $\beta_i$ are horizontal and vertical weights, respectively. ($\alpha_i = 1$, $\beta_i = 1$ ⇒ half perimeter of the bounding box of Net i.)
 Critical nets: Increase both $\alpha_i$ and $\beta_i$.
 If vertical wirings are “cheaper” than horizontal wirings, use smaller vertical weights: $\beta_i < \alpha_i$.
 $C_2$: penalty function for module overlaps.
 $C_2 = \gamma \sum_{i \ne j} O_{ij}^2$, $\gamma$: penalty weight.
 $O_{ij}$: amount of overlaps in the x-dimension between modules i and j.
 $C_3$: penalty function that controls the row length.
 $C_3 = \delta \sum_{r \in \mathrm{Rows}} |L_r - D_r|$, $\delta$: penalty weight.
 $D_r$: desired row length.
 $L_r$: sum of the widths of the modules in row r.

Annealing Schedule
$T_k = r_k T_{k-1}$, k = 1, 2, 3, …
$r_k$ increases from 0.8 to max value 0.94 and then decreases to 0.8.
At each temperature, a total # of nP attempts is made.
n: # of modules; P: user-specified constant.
Termination: T < 0.1.


Outline
 Partitioning

 Floorplanning

 Placement

 Routing
 Global routing
 Detailed routing

 Compaction

Routing
Course contents:
 Global routing
 Detail routing
Reading
 Chapter 12


Routing

Routing Constraints
 100% routing completion + area minimization, under a set
of constraints:
 Placement constraint: usually based on fixed placement
 Number of routing layers
 Geometrical constraints: must satisfy design rules
 Timing constraints (performance-driven routing): must satisfy
delay constraints
 Crosstalk?
 Process variations?


Classification of Routing

Maze Router: Lee Algorithm
 Lee, “An algorithm for path connection and its application,” IRE Trans. Electronic Computer, EC-10, 1961.
 Discussion mainly on single-layer routing
 Strengths
 Guaranteed to find a connection between 2 terminals if one exists.
 Guaranteed to find a minimum-length path.
 Weaknesses
 Requires large memory for dense layout.
 Slow.
 Applications: global routing, detailed routing


Lee Algorithm
 Find a path from S to T by “wave propagation”.

 Time & space complexity for an M × N grid: O(MN) (huge!)
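
Wave propagation is a breadth-first search over the grid, followed by a retrace from T along decreasing labels. The sketch below is illustrative: the function name, the grid encoding (0 = free, 1 = blocked), and the example grid are assumptions.

```python
from collections import deque


def lee_route(grid, source, target):
    """Return a shortest path from source to target as a list of cells, or None."""
    rows, cols = len(grid), len(grid[0])
    label = {source: 0}
    frontier = deque([source])
    # Filling phase: expand the wavefront until the target is labeled.
    while frontier and target not in label:
        r, c = frontier.popleft()
        for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nb
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0 and nb not in label:
                label[nb] = label[(r, c)] + 1
                frontier.append(nb)
    if target not in label:
        return None
    # Retrace phase: walk back from the target through labels k-1, k-2, ..., 0.
    path = [target]
    while path[-1] != source:
        r, c = path[-1]
        k = label[(r, c)]
        for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if label.get(nb) == k - 1:
                path.append(nb)
                break
    return path[::-1]


# Hypothetical 3 x 4 grid with a wall that forces a detour around the right side.
grid = [
    [0, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
]
path = lee_route(grid, (0, 0), (2, 0))
```

The `label` dictionary is what makes the memory cost O(MN): in the worst case every grid cell receives a label.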

Reducing Memory Requirement
 Akers's Observations (1967)
 Adjacent labels for k are either k-1 or k+1.
 Want a labeling scheme such that each label has its preceding label
different from its succeeding label.
 Way 1: coding sequence 1, 2, 3, 1, 2, 3, …; states: 1, 2, 3, empty,
blocked (3 bits required)
 Way 2: coding sequence 1, 1, 2, 2, 1, 1, 2, 2, …; states: 1, 2, empty,
blocked (need only 2 bits)
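
The second coding scheme can be checked with a few lines. The function name is an assumption; `k` is the wavefront distance starting from 1.

```python
def akers_label(k):
    """Label for wavefront distance k under the sequence 1, 1, 2, 2, 1, 1, 2, 2, ..."""
    return ((k - 1) // 2) % 2 + 1


sequence = [akers_label(k) for k in range(1, 9)]
# For every k, the preceding label differs from the succeeding one,
# so the retrace direction is never ambiguous.
unambiguous = all(akers_label(k - 1) != akers_label(k + 1) for k in range(2, 100))
```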


Reducing Running Time


 Starting point selection: Choose the point farthest from the
center of the grid as the starting point.
 Double fan-out: Propagate waves from both the source and
the target cells.
 Framing: Search inside a rectangle area 10--20% larger
than the bounding box containing the source and target.
 Need to enlarge the rectangle and redo if the search fails.

Hadlock's Algorithm
 Hadlock, “A shortest path algorithm for grid graphs,”
Networks, 1977.
 Uses detour number (instead of labeling wavefront in
Lee's router)
 Detour number, d(P): # of grid cells directed away
from its target on path P.
 MD(S, T): the Manhattan distance between S and T.
 Path length of P, l(P): l(P) = MD(S, T) + 2 d(P).
 MD(S, T) fixed! ⇒ Minimize d(P) to find the shortest path.
 For any cell labeled i, label each adjacent unblocked cell i+1 if it lies away from T, and i otherwise.
 Time and space complexities: O(MN), but
substantially reduces the # of searched cells.
 Finds the shortest path between S and T.
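
The detour-number labeling can be sketched as a search in which a step toward T costs 0 and a step away from T costs 1, so cells with smaller detour numbers are always expanded first. The function name, the grid encoding (0 = free, 1 = blocked), the double-ended queue used to order the expansion, and the example grid are assumptions made for illustration.

```python
from collections import deque


def hadlock_length(grid, source, target):
    """Return the shortest path length MD(S, T) + 2 d(P), or None if unroutable."""
    rows, cols = len(grid), len(grid[0])

    def md(p):  # Manhattan distance to the target
        return abs(p[0] - target[0]) + abs(p[1] - target[1])

    detour = {source: 0}
    frontier = deque([source])
    while frontier:
        cell = frontier.popleft()
        if cell == target:
            return md(source) + 2 * detour[cell]
        r, c = cell
        for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nb[0] < rows and 0 <= nb[1] < cols) or grid[nb[0]][nb[1]]:
                continue
            step = 0 if md(nb) < md(cell) else 1  # 1 if the move is away from T
            d = detour[cell] + step
            if d < detour.get(nb, float("inf")):
                detour[nb] = d
                if step == 0:
                    frontier.appendleft(nb)  # same detour number: expand first
                else:
                    frontier.append(nb)
    return None


# Hypothetical grid: the wall forces three steps away from T, so d(P) = 3.
grid = [
    [0, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
]
length = hadlock_length(grid, (0, 0), (2, 0))
```

With MD(S, T) = 2 and d(P) = 3, the path length is 2 + 2 × 3 = 8.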


Hadlock's Algorithm (cont'd)


Soukup's Algorithm
 Soukup, “Fast maze router,” DAC-78.
 Combined breadth-first and depth-first search.
 Depth-first (line) search is first directed toward target T until
an obstacle or T is reached.
 Breadth-first (Lee-type) search is used to “bubble” around an
obstacle if an obstacle is reached.
 Time and space complexities: O(MN), but 10~50 times faster
than Lee's algorithm.
 Finds a path between S and T, but it may not be the shortest!


Mikami-Tabuchi's Algorithm
 Mikami & Tabuchi, “A computer program for optimal routing
of printed circuit conductors,” IFIP, H47, 1968.
 Every grid point is an escape point.

106
Hightower's Algorithm
 Hightower, “A solution to line-routing problem on the
continuous plane,” DAC-69.
 A single escape point on each line segment.
 If a line runs parallel to the blocked cells, the escape point is
placed just past the endpoint of the segment.

107

Global Routing Graph


Each cell is represented by a vertex.
Two vertices are joined by an edge if the
corresponding cells are adjacent to each
other.

108
Global-Routing Problem
 Given a netlist N = {N1, N2, …, Nn} and a routing
graph G = (V, E), find a Steiner tree Ti for each net
Ni, 1 ≤ i ≤ n, such that U(ej) ≤ c(ej), ∀ ej ∈ E, and
Σi L(Ti) is minimized, where
 c(ej): capacity of edge ej
 xij = 1 if ej is in Ti; xij = 0 otherwise
 U(ej) = Σi xij: # of wires that pass through the channel
corresponding to edge ej
 L(Ti): total wirelength of Steiner tree Ti
 For high performance, the maximum wirelength
maxi L(Ti) is minimized (or the longest path
between two points in Ti is minimized).
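
To make the formulation concrete, the sketch below evaluates a candidate solution: it computes U(ej) for every edge, reports edges whose usage exceeds c(ej), and sums the wirelength. The data layout (each tree as a list of vertex pairs, unit length per edge) and the name `evaluate_global_routing` are assumptions for illustration only.

```python
from collections import Counter

def evaluate_global_routing(trees, capacity):
    """trees: one edge list per net; capacity: dict edge -> c(e).

    Edges are undirected, so each is normalized to a sorted tuple.
    Every edge is assumed to have unit length.
    """
    usage = Counter()
    total_wirelength = 0
    for tree in trees:
        edges = {tuple(sorted(edge)) for edge in tree}
        total_wirelength += len(edges)   # L(Ti) with unit-length edges
        for edge in edges:
            usage[edge] += 1             # x_ij = 1 for this net and edge
    overflow = {
        edge: used - capacity.get(edge, 0)
        for edge, used in usage.items()
        if used > capacity.get(edge, 0)
    }
    return usage, overflow, total_wirelength
```

A solution is feasible in the sense above exactly when `overflow` is empty.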

109

Classification of Global-Routing
Algorithms
 Sequential approach:
 Select a net order and route nets sequentially in the
order
 Earlier routed nets might block the routing of
subsequent nets
 Routing quality heavily depends on net ordering
 Strategy: Heuristic net ordering + rip-up and rerouting
 Concurrent approach:
 All nets are considered simultaneously
 E.g., 0-1 integer linear programming (0-1 ILP)

110
Net Ordering
 Net ordering greatly affects routing solutions.
 In the example, we should route net b before net a.

111

Net Ordering (cont’d)


Order the nets in the ascending order of
the # of pins within their bounding boxes.
Order the nets in the ascending
(descending) order of their lengths if
routability (timing) is the most critical
metric.
Order the nets based on their timing
criticality.
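
The first ordering rule can be sketched as follows. The interpretation that only pins of other nets are counted inside a net's bounding box, the tie-break by net name, and the name `order_nets_by_enclosed_pins` are assumptions for this example.

```python
def order_nets_by_enclosed_pins(nets):
    """nets: dict name -> list of (x, y) pins.

    Returns net names in ascending order of the number of pins of
    other nets that fall inside each net's bounding box.
    """
    def enclosed(name):
        xs = [x for x, _ in nets[name]]
        ys = [y for _, y in nets[name]]
        lo_x, hi_x, lo_y, hi_y = min(xs), max(xs), min(ys), max(ys)
        return sum(
            lo_x <= x <= hi_x and lo_y <= y <= hi_y
            for other, pins in nets.items() if other != name
            for x, y in pins
        )
    return sorted(nets, key=lambda name: (enclosed(name), name))
```

For a net whose bounding box encloses both pins of a second, smaller net, the smaller net is routed first.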

112
Rip-Up and Re-routing
 Rip-up and re-routing is required if a global or
detailed router fails in routing all nets.
 Approaches: the manual approach? the automatic
procedure?
 Two steps in rip-up and re-routing
1. Identify bottleneck regions, rip off some already routed
nets.
2. Route the blocked connections, and re-route the ripped-
up connections.
 Repeat the above steps until all connections are
routed or a time limit is exceeded.

113

Top-down Hierarchical Global Routing


 Recursively divides routing regions into
successively smaller super cells, and nets at
each hierarchical level are routed sequentially or
concurrently.

114
Bottom-up Hierarchical Global Routing
 At each hierarchical level, routing is restrained
within each super cell individually.
 When the routing at the current level is finished,
every four super cells are merged to form a new
larger super cell at the next higher level.

115

Hybrid Hierarchical Global Routing


 (1) neighboring propagation, (2) preference
partitioning, and (3) bounded routing

116
The Routing-Tree Problem
 Problem: Given a set of pins of a net, interconnect the pins by a
“routing tree.”

 Minimum Rectilinear Steiner Tree (MRST) Problem: Given n
points in the plane, find a minimum-length tree of rectilinear
edges which connects the points.
 MRST(P) = MST(P ∪ S), where P and S are the sets of original
points and Steiner points, respectively.

117

Theoretical Results for the MRST Problem
 Hanan’s Theorem: There exists an MRST with all Steiner points (set
S) chosen from the intersection points of horizontal and vertical
lines drawn through the points of P.
 Hanan, “On Steiner's problem with rectilinear distance,” SIAM
J. Applied Math., 1966.
 Hwang’s Theorem: For any point set P,
Cost(MST(P)) / Cost(MRST(P)) ≤ 3/2.
 Hwang, “On Steiner minimal tree with rectilinear distance,”
SIAM J. Applied Math., 1976.
 Best existing approximation algorithm: Performance bound 61/48
by Foessmeier et al.

118
Coping with the MRST Problem
 Ho, Vijayan, Wong, “New algorithms for the rectilinear
Steiner problem,”
1. Construct an MRST from an MST.
2. Each edge is straight or L-shaped.
3. Maximize overlaps by dynamic programming.
 About 8% smaller than Cost(MST).

119

Iterated 1-Steiner Heuristic for MRST


 Kahng & Robins, “A new class of Steiner tree heuristics with good
performance: the iterated 1-Steiner approach,” ICCAD-90.
Algorithm: Iterated_1-Steiner(P)
P: set of n points.
begin
  S ← ∅;
  /* H(P ∪ S): set of Hanan points */
  /* ΔMST(A, B) = Cost(MST(A)) − Cost(MST(A ∪ B)) */
  while (Cand ← {x ∈ H(P ∪ S) | ΔMST(P ∪ S, {x}) > 0}) ≠ ∅ do
    Find x ∈ Cand which maximizes ΔMST(P ∪ S, {x});
    S ← S ∪ {x};
    Remove points in S which have degree ≤ 2 in MST(P ∪ S);
  return MST(P ∪ S);
end
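
A direct, unoptimized Python rendering of the heuristic is sketched below. It recomputes the MST from scratch for every candidate Hanan point, so it only suits small examples; the helper names (`dist`, `mst`, `hanan_points`) and the use of Prim's algorithm are assumptions of this sketch, not part of the Kahng-Robins paper.

```python
from collections import Counter
from itertools import product

def dist(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def mst(points):
    """Prim's algorithm under rectilinear distance; returns (cost, edges)."""
    pts = sorted(points)
    if len(pts) < 2:
        return 0, []
    best = {p: (dist(pts[0], p), pts[0]) for p in pts[1:]}
    cost, edges = 0, []
    while best:
        p = min(best, key=lambda q: best[q][0])
        d, parent = best.pop(p)
        cost += d
        edges.append((parent, p))
        for q in best:
            if dist(p, q) < best[q][0]:
                best[q] = (dist(p, q), p)
    return cost, edges

def hanan_points(points):
    xs = {x for x, _ in points}
    ys = {y for _, y in points}
    return set(product(xs, ys)) - set(points)

def iterated_1_steiner(pins):
    pins, steiner = set(pins), set()
    while True:
        current = pins | steiner
        base, _ = mst(current)
        best_x, best_gain = None, 0
        for x in sorted(hanan_points(current)):
            gain = base - mst(current | {x})[0]   # delta-MST of candidate x
            if gain > best_gain:
                best_x, best_gain = x, gain
        if best_x is None:                         # Cand is empty
            break
        steiner.add(best_x)
        # Drop Steiner points that ended up with degree <= 2.
        _, edges = mst(pins | steiner)
        degree = Counter(v for edge in edges for v in edge)
        steiner = {s for s in steiner if degree[s] > 2}
    return steiner, mst(pins | steiner)[0]
```

For four pins arranged as a cross around (1, 1), the MST costs 6, and adding the single Steiner point (1, 1) reduces the cost to 4.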

120
Outline
 Partitioning

 Floorplanning

 Placement

 Routing
 Global routing
 Detailed routing

 Compaction

121

Channel Routing
 In earlier process technologies, channel routing
was pervasively used since most wires were
routed in the free space (i.e., routing channel)
between a pair of logic blocks (cell rows)

122
Routing Region Decomposition
There are often various ways to
decompose a routing region.
The order of routing regions significantly
affects the channel-routing process.

123

Routing Models
 Grid-based model:
 A grid is super-imposed on the routing region.
 Wires follow paths along the grid lines.
 Pitch: distance between two gridded lines
 Gridless model:
 Any model that does not follow this “gridded” approach.

124
Models for Multi-Layer Routing
 Unreserved layer model: Any net segment is
allowed to be placed in any layer.

 Reserved layer model: Certain types of
segments are restricted to particular layer(s).
 Two-layer: HV (Horizontal-Vertical), VH
 Three-layer: HVH, VHV

125

Terminology for Channel Routing


 Local density at column i, d(i): total # of nets that cross column i.
 Channel density: maximum local density.
 # of horizontal tracks required ≥ channel density.
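
As a small illustration, the sketch below computes local and channel density when each net is given by its horizontal span. The interval representation and the sample data are assumptions for this example.

```python
def local_density(intervals, column):
    """Number of nets whose horizontal span [s, e] covers the column."""
    return sum(s <= column <= e for s, e in intervals.values())

def channel_density(intervals):
    """Maximum local density over all columns touched by any net."""
    first = min(s for s, _ in intervals.values())
    last = max(e for _, e in intervals.values())
    return max(local_density(intervals, c) for c in range(first, last + 1))

# Hypothetical sample spans for six nets.
spans = {
    "I1": (1, 3), "I2": (2, 6), "I3": (4, 8),
    "I4": (5, 10), "I5": (7, 11), "I6": (9, 12),
}
```

With these spans the channel density is 3, so at least three horizontal tracks are needed.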

126
Channel Routing Problem
 Assignments of horizontal segments of nets to tracks.
 Assignments of vertical segments to connect the following:
 horizontal segments of the same net in different tracks, and
 terminals of the net to horizontal segments of the net.
 Horizontal and vertical constraints must not be violated.
 Horizontal constraint between two nets: the horizontal spans
of the two nets overlap.
 Vertical constraint between two nets: there exists a column
such that the terminal on top of the column belongs to one net
and the terminal on the bottom of the column belongs to the other
net.
 Objective: Channel height is minimized (i.e., channel area
is minimized).

127

Horizontal Constraint Graph (HCG)


 HCG G = (V, E) is an undirected graph where
 V = { vi | vi represents a net ni}
 E = {(vi, vj) | a horizontal constraint exists between ni
and nj}.
 For graph G: vertices ⇔ nets; edge (i, j) ⇔ net i overlaps
net j.

128
Vertical Constraint Graph (VCG)
 VCG G = (V, E) is a directed graph where
 V = { vi | vi represents a net ni}
 E = {(vi, vj) | a vertical constraint exists between
ni and nj}.
 For graph G: vertices ⇔ nets; edge i → j ⇔ net i
must be above net j.
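
Both graphs can be derived from the terminal lists along the top and bottom of the channel. The sketch below assumes a common textbook encoding (one net number per column, 0 for an empty terminal); the function names are hypothetical.

```python
from itertools import combinations

def net_spans(top, bottom):
    """Leftmost and rightmost column of each net; 0 marks an empty terminal."""
    spans = {}
    for col, pair in enumerate(zip(top, bottom)):
        for net in pair:
            if net:
                lo, hi = spans.get(net, (col, col))
                spans[net] = (min(lo, col), max(hi, col))
    return spans

def horizontal_constraint_graph(top, bottom):
    """Undirected edges (a, b), a < b, for nets whose spans overlap."""
    spans = net_spans(top, bottom)
    return {
        (a, b) for a, b in combinations(sorted(spans), 2)
        if spans[a][0] <= spans[b][1] and spans[b][0] <= spans[a][1]
    }

def vertical_constraint_graph(top, bottom):
    """Directed edges (i, j): net i must be placed above net j."""
    return {(t, b) for t, b in zip(top, bottom) if t and b and t != b}
```

A cycle in the directed graph means that no dogleg-free assignment of whole nets to tracks can satisfy all vertical constraints.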

129

2-Layer Channel Routing: Basic Left-Edge Algorithm
 Hashimoto & Stevens, “Wire routing by optimizing channel
assignment within large apertures,” DAC-71.
 No vertical constraint.
 HV-layer model is used.
 Doglegs are not allowed.
 Treat each net as an interval.
 Intervals are sorted according to their left-end x-
coordinates.
 Intervals (nets) are routed one-by-one according to the
order.
 For a net, tracks are scanned from top to bottom, and the
first track that can accommodate the net is assigned to the
net.
 Optimality: produces a routing solution with the minimum
# of tracks (if no vertical constraint).

130
Basic Left-Edge Algorithm
Algorithm: Basic_Left-Edge(U, track[j])
U: set of unassigned intervals (nets) I1, …, In;
Ij = [sj, ej]: interval j with left-end x-coordinate sj and right-end x-coordinate ej;
track[j]: track to which net j is assigned.

begin
  U ← {I1, I2, …, In};
  t ← 0;
  while (U ≠ ∅) do
    t ← t + 1;
    watermark ← 0;
    while (there is an Ij ∈ U s.t. sj > watermark) do
      Pick the interval Ij ∈ U with sj > watermark,
        nearest watermark;
      track[j] ← t;
      watermark ← ej;
      U ← U − {Ij};
end

131

Basic Left-Edge Example


 U = {I1, I2, …, I6}; I1 = [1, 3], I2 = [2, 6], I3 = [4, 8], I4 = [5,
10], I5 = [7, 11], I6 = [9, 12].
 t =1:
 Route I1: watermark = 3;
 Route I3 : watermark = 8;
 Route I6: watermark = 12;
 t = 2:
 Route I2 : watermark = 6;
 Route I5 : watermark = 11;
 t = 3: Route I4
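
The trace above can be reproduced with a short Python sketch of the basic left-edge algorithm. The dictionary-based interface is an assumption of this example, and the watermark starts at negative infinity instead of 0 so that non-positive coordinates are also handled.

```python
def left_edge(intervals):
    """intervals: dict net -> (s, e). Returns dict net -> track (1-based)."""
    unassigned = dict(intervals)
    track = {}
    t = 0
    while unassigned:
        t += 1
        watermark = float("-inf")   # the slide uses 0; assumes all s > 0
        while True:
            fits = [j for j, (s, _) in unassigned.items() if s > watermark]
            if not fits:
                break
            # Pick the fitting interval whose left end is nearest the watermark.
            j = min(fits, key=lambda k: unassigned[k][0])
            track[j] = t
            watermark = unassigned.pop(j)[1]
    return track

example = {
    "I1": (1, 3), "I2": (2, 6), "I3": (4, 8),
    "I4": (5, 10), "I5": (7, 11), "I6": (9, 12),
}
assignment = left_edge(example)
```

Here `assignment` places I1, I3, I6 on track 1, I2 and I5 on track 2, and I4 on track 3, matching the example.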

132
Basic Left-Edge Algorithm
 If there is no vertical
constraint, the basic
left-edge algorithm is
optimal.
 If there is any vertical
constraint, the
algorithm no longer
guarantees optimal
solution.

133

Constrained Left-Edge Algorithm


Algorithm: Constrained_Left-Edge(U, track[j])
U: set of unassigned intervals (nets) I1, …, In;
Ij = [sj, ej]: interval j with left-end x-coordinate sj and right-end x-coordinate ej;
track[j]: track to which net j is assigned.

begin
  U ← {I1, I2, …, In};
  t ← 0;
  while (U ≠ ∅) do
    t ← t + 1;
    watermark ← 0;
    while (there is an unconstrained Ij ∈ U s.t. sj > watermark) do
      Pick the interval Ij ∈ U that is unconstrained,
        with sj > watermark, nearest watermark;
      track[j] ← t;
      watermark ← ej;
      U ← U − {Ij};
end

134
Constrained Left-Edge Example
 I1 = [1, 3], I2 = [1, 5], I3 = [6, 8], I4 = [10, 11], I5= [2,
6], I6 = [7, 9].
 Track 1: Route I1 (cannot route I3); Route I6; Route I4.
 Track 2: Route I2; cannot route I3.
 Track 3: Route I5.
 Track 4: Route I3.
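
A sketch of the constrained variant follows. The vertical constraint graph of this example is not recoverable from the extracted slide text, so the single edge I5 → I3 used below is an assumption chosen because it reproduces the listed track assignment. A net is treated as unconstrained once all of its VCG predecessors sit on strictly earlier (higher) tracks.

```python
def constrained_left_edge(intervals, above):
    """intervals: dict net -> (s, e); above: set of VCG edges (i, j),
    meaning net i must be above net j. Returns dict net -> track."""
    preds = {j: set() for j in intervals}
    for upper, lower in above:
        preds[lower].add(upper)
    unassigned = dict(intervals)
    track = {}
    t = 0
    while unassigned:
        t += 1
        watermark = float("-inf")
        placed_any = False
        while True:
            fits = [
                j for j, (s, _) in unassigned.items()
                if s > watermark
                and all(track.get(p, t) < t for p in preds[j])
            ]
            if not fits:
                break
            j = min(fits, key=lambda k: unassigned[k][0])
            track[j] = t
            watermark = unassigned.pop(j)[1]
            placed_any = True
        if not placed_any:
            # Every remaining net waits on another one: a constraint cycle.
            raise ValueError("cyclic vertical constraints")
    return track

nets = {
    "I1": (1, 3), "I2": (1, 5), "I3": (6, 8),
    "I4": (10, 11), "I5": (2, 6), "I6": (7, 9),
}
result = constrained_left_edge(nets, {("I5", "I3")})  # assumed VCG
```

The explicit error on an empty track is what the pseudocode lacks: with a constraint cycle its outer loop would never terminate.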

135

Dogleg Channel Router


 Deutsch, “A dogleg channel router,” 13th DAC, 1976.
 Drawback of Left-Edge: cannot handle the cases with
constraint cycles.
 Drawback of Left-Edge: the entire net is on a single track.
 Doglegs are used to place parts of a net on different tracks to
minimize channel height.
 Might incur penalty for additional vias.

136
Dogleg Channel Router
 Each multi-pin net is broken into a set of 2-pin nets.
 Modified Left-Edge Algorithm is applied to each subnet.

137

Dogleg Channel Routing Example

138
Modern Routing Considerations
 Signal/power Integrity
 Capacitive crosstalk
 Inductive crosstalk
 IR drop
 Manufacturability
 Process variation
 Optical proximity correction (OPC)
 Chemical mechanical polishing (CMP)
 Phase-Shift Mask (PSM)
 Reliability
 Double via insertion
 Process antenna effect
 Electromigration (EM)
 Electrostatic discharge (ESD)

139

Outline
Partitioning

Floorplanning

Placement

Routing

Compaction

140
Layout Compaction
Course contents
 Design rules
 Symbolic layout
 Constraint-graph compaction

141

Design Rules
 Design rules: restrictions on the mask patterns to
increase the probability of successful fabrication.
 Patterns and design rules are often expressed in λ rules.
 Most common design rules:
 minimum-width rules (valid for a mask pattern
of a specific layer): (a).
 minimum-separation rules (between mask patterns of
the same layer or different layers): (b), (c), (d).
 minimum-overlap rules (mask patterns in different
layers): (e).

142
CMOS Inverter Layout Example

[Figure: CMOS inverter shown as a geometric layout and as a symbolic layout. Legend: p/n diffusion, polysilicon, contact cut, metal.]
143

Symbolic Layout
 Geometric (mask) layout: coordinates of the layout
patterns (rectangles) are absolute (or in multiples of λ).
 Symbolic (topological) layout: only relations between layout
elements (below, left of, etc.) are known.
 Symbols are used to represent elements located in several
layers, e.g. transistors, contact cuts.
 The length, width or layer of a wire or other layout element
might be left unspecified.
 Mask layers not directly related to the functionality of the
circuit do not need to be specified, e.g. n-well, p-well.
 The symbolic layout can work with a technology file that
contains all design rule information for the target
technology to produce the geometric layout.

144
Compaction and its Applications
 A compaction program or compactor generates
layout at the mask level. It attempts to make the
layout as dense as possible.
 Applications of compaction:
 Area minimization: remove redundant space in
layout at the mask level.
 Layout compilation: generate mask-level layout
from symbolic layout.
 Redesign: automatically remove design-rule
violations.
 Rescaling: convert mask-level layout from one
technology to another.

145

Aspects of Compaction
 Dimension:
 1-dimensional (1D) compaction: layout
elements are moved or shrunk in only one
dimension (x or y direction).
Is often performed first in the x-dimension and then
in the y-dimension (or vice versa).
 2-dimensional (2D) compaction: layout
elements are moved and shrunk
simultaneously in two dimensions.
 Complexity:
 1D compaction can be done in polynomial
time.
 2D compaction is NP-hard.

146
1D Compaction: X Followed By Y
 Each square is 2λ × 2λ; minimum separation is
1λ.
 Initially, the layout is 11λ × 11λ.
 After compacting along the x direction, then the y
direction, we have the layout size of 8λ × 11λ.

147

1D Compaction: Y Followed By X
 Each square is 2λ × 2λ; minimum separation is
1λ.
 Initially, the layout is 11λ × 11λ.
 After compacting along the y direction, then the x
direction, we have the layout size of 11λ × 8λ.

148
2D Compaction
 Each square is 2λ × 2λ; minimum separation is 1λ.
 Initially, the layout is 11λ × 11λ.
 After 2D compaction, the layout size is only 8λ × 8λ.
 Since 2D compaction is NP-hard, most compactors are
based on repeated 1D compaction.
149

Inequalities for Distance Constraints


 Minimum-distance design rules can be expressed as
inequalities: xj – xi ≥ dij.
 For example, if the minimum width is a and the minimum
separation is b, then
x2 – x1 ≥ a
x3 – x2 ≥ b
x3 – x6 ≥ b

150
The Constraint Graph
 The inequalities can be used to construct a constraint graph
G(V, E):
 There is a vertex vi for each variable xi.
 For each inequality xj – xi ≥ dij there is an edge (vi, vj) with
weight dij.
 There is an extra source vertex, v0; it is located at x = 0; all
other vertices are to its right.
 If all the inequalities express minimum-distance
constraints, the graph is acyclic (DAG).
 The longest path in a constraint graph determines the
layout dimension.
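
For the acyclic case, the leftmost legal position of every element is its longest-path distance from the source, which a single pass in topological order computes in linear time. The sketch below assumes vertices are numbered 0..n-1 with vertex 0 as the source; the function name `compact_1d` and the sample values a = 2, b = 1 are illustrative only.

```python
from collections import defaultdict, deque

def compact_1d(num_vars, constraints):
    """constraints: list of (i, j, d) meaning x_j - x_i >= d.

    Returns the smallest non-negative x satisfying all constraints,
    computed as longest paths in topological order.
    """
    succ = defaultdict(list)
    indegree = [0] * num_vars
    for i, j, d in constraints:
        succ[i].append((j, d))
        indegree[j] += 1
    x = [0] * num_vars                      # everything lies right of x = 0
    queue = deque(v for v in range(num_vars) if indegree[v] == 0)
    visited = 0
    while queue:
        u = queue.popleft()
        visited += 1
        for v, d in succ[u]:
            x[v] = max(x[v], x[u] + d)      # relax the edge for longest path
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    if visited != num_vars:
        raise ValueError("constraint graph is not acyclic")
    return x

# Two wires of minimum width a = 2 separated by at least b = 1.
positions = compact_1d(5, [(0, 1, 0), (1, 2, 2), (2, 3, 1), (3, 4, 2)])
```

The largest entry of `positions` is the compacted layout width in this dimension.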

constraint graph
151

Maximum-Distance Constraints
 Sometimes the distance between layout elements is bounded by a
maximum, e.g., when the user wants a maximum wire
width or wants to keep a wire connected to a via.
 A maximum-distance constraint gives an inequality of the
form: xj – xi ≤ cij, or equivalently xi – xj ≥ -cij.
 Consequence for the constraint graph: a backward edge
(vj, vi) with weight dji = -cij; the graph is no longer acyclic.
 The longest path in a constraint graph determines the
layout dimension.

152
Longest-Paths in Cyclic Graphs
 Constraint-graph compaction with maximum-distance
constraints requires solving the longest-path problem in
cyclic graphs.
 Two cases are distinguished:
 There are positive cycles: No bounded solution for
longest paths. (The inequality constraints are
conflicting.) We shall detect the cycles.
 All cycles are negative: Polynomial-time algorithms
exist.
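
One polynomial-time option for the second case is a Bellman-Ford style relaxation adapted to longest paths, which also detects positive cycles. This is a sketch of that general technique, not necessarily the algorithm intended in the lecture; the edge-list interface is an assumption.

```python
def longest_paths(num_vars, edges, source=0):
    """edges: list of (i, j, w) meaning x_j - x_i >= w.

    Returns longest-path distances from the source, or raises
    ValueError if a positive cycle makes the constraints conflict.
    """
    x = [float("-inf")] * num_vars
    x[source] = 0
    for _ in range(num_vars - 1):
        changed = False
        for i, j, w in edges:
            if x[i] + w > x[j]:
                x[j] = x[i] + w
                changed = True
        if not changed:
            return x
    # Any further improvement implies a positive cycle.
    for i, j, w in edges:
        if x[i] + w > x[j]:
            raise ValueError("positive cycle: conflicting constraints")
    return x
```

For example, x2 - x1 >= 3 together with x2 - x1 <= 5 (backward edge of weight -5) is satisfiable, whereas x2 - x1 >= 3 with x2 - x1 <= 2 creates a positive cycle.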

153

Longest and Shortest Paths


 Longest paths become shortest paths and vice
versa when edge weights are multiplied by –1.
 Situation in DAGs: both the longest and shortest
path problems can be solved in linear time.
 Situation in cyclic directed graphs:
 All weights are positive: shortest-path problem in P
(Dijkstra), no feasible solution for the longest-path
problem.
 All weights are negative: longest-path problem in P
(Dijkstra), no feasible solution for the shortest-path
problem.
 No positive cycles: longest-path problem is in P.
 No negative cycles: shortest-path problem is in P.

154
Remarks on Constraint-Graph
Compaction
 Noncritical layout elements: Every element outside the
critical paths has some freedom in its position; this freedom
can be used to optimize some cost function.
 Automatic jog insertion: The quality of the layout can
further be improved by automatic jog insertion.
 Hierarchy: A method to reduce complexity is hierarchical
compaction, e.g., consider cells only.

155

Constraint Generation
 The set of constraints should be irredundant and
generated efficiently.
 An edge (vi, vj) is redundant if edges (vi, vk) and (vk, vj)
exist and w((vi, vj)) ≤ w((vi, vk)) + w((vk, vj)).
 The minimum-distance constraints for (A, B) and (B, C)
make that for (A, C) redundant.
 Doenhardt and Lengauer have proposed a method for
irredundant constraint generation with complexity O(n log n).
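
The redundancy test itself is easy to state in code. The sketch below is a naive check over all intermediate vertices, given only to illustrate the definition on an acyclic constraint graph; it is not the O(n log n) method of Doenhardt and Lengauer, and the function name is hypothetical.

```python
def redundant_edges(weights):
    """weights: dict (i, j) -> w for each constraint edge.

    Returns the edges (i, j) for which some vertex k gives
    w(i, j) <= w(i, k) + w(k, j).
    """
    vertices = {v for edge in weights for v in edge}
    return {
        (i, j)
        for (i, j), w in weights.items()
        for k in vertices
        if k not in (i, j)
        and (i, k) in weights
        and (k, j) in weights
        and w <= weights[(i, k)] + weights[(k, j)]
    }
```

With minimum distances of 3 for (A, B) and (B, C), a constraint of 2 for (A, C) is implied and can be dropped, while a constraint of 7 is not.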
156
