1, JANUARY 2002
Short Papers
An Efficient and Regular Routing Methodology for
Datapath Designs Using Net Regularity Extraction
Sabyasachi Das and Sunil P. Khatri
I. INTRODUCTION
As we migrate toward ultra-deep-submicrometer feature sizes, designs are becoming increasingly complex with very aggressive goals.
Datapaths are one of the more critical parts of the design. It is well understood that traditional design automation methodologies are not well
suited for the design of high-performance datapaths. As a result, datapath blocks are usually designed manually, resulting in significantly
larger design time and cost.
To solve this problem, researchers are actively trying to develop design automation methodologies that are suitable for the design of
datapath circuits. For example, several datapath placement [1], [2] and
synthesis [3] techniques have been reported. In [4], the authors introduce a datapath routing methodology. Their work differs from ours in
that it uses probabilistic measures of congestion to guide the routing,
which is performed simultaneously for all nets. Results are reported
on small designs, while our goal is to tackle very large industrial datapaths. To the best of our knowledge, there has been no other research
on detailed routing for datapaths.
In this paper, we propose a new detailed routing methodology that
exploits the regularity of connections in a datapath circuit. In our
scheme, we route all the regular nets in a similar fashion so as to
ensure good-quality, regular routes. This results in highly predictable
timing characteristics of the resulting design, and the routing process
is much faster than with conventional routers.
We have organized the rest of the paper as follows: Section II
presents general characteristics and some definitions of a datapath.
In Section III, we discuss our proposed flow. Section IV presents the
advantages of our approach. Experimental results are provided in
Section V and conclusions are drawn in Section VI.
II. CHARACTERISTICS OF DATAPATHS
Datapaths are commonly found in microprocessors, digital signal
processors, and graphics integrated circuits. In datapaths, the same
logic is repeated multiple times. We define a bit slice as the logic
corresponding to a particular bit. In practice, N bit slices are abutted
to obtain the design of an N-bit datapath. The layout width of all bit
slices is identical and we call this the bit pitch or pitch. The convention
we follow for this paper is that the data flows vertically and control
flows horizontally. In most standard-cell-based datapath styles, each
bit slice is composed of multiple instances of standard cells (or larger
master cells).
Manuscript received April 10, 2001; revised August 2, 2001. This paper was
recommended by Guest Editor S. S. Sapatnekar.
S. Das is with Cadence Design Systems, San Jose, CA 95134 USA
(e-mail: [email protected]).
S. P. Khatri is with the Department of Electrical and Computer Engineering,
University of Colorado, Boulder, CO 80309 USA (e-mail: [email protected]).
Publisher Item Identifier S 0278-0070(02)00100-8.
First, we read the schematic (logic) netlist of the whole block, which
consists of several instances of library cells. Currently, our tool can
handle only two levels of hierarchy. In the top level of the hierarchy, all
the connections between the instances are specified. In the lower level,
logical details of the library cells are specified.
B. Generating the Placement
Next, we place instances of the master cells of the datapath block in
a structured manner. In this paper, we do not focus on placement, since
several datapath placement algorithms are already available. Rather, we
use an industrial datapath placement tool to produce a regular placement.
C. Reading the Layout Information of Cells
In this step, we read the layout information of the library cells that
make up the datapath. In particular, we obtain details about the blockages present in the datapath block.
D. Extracting Net Clusters
In a datapath block, several regular structures are present across multiple bit slices. Techniques to extract regular instance structures have
been proposed by Arikati et al. [5] and Hassoun et al. [6].
In this paper, we extract regular net structures present in different
bit slices. We define a net cluster as a collection of nets (spread over
IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 21, NO. 1, JANUARY 2002
different bit slices) in which all nets have similar connections. In particular, if two nets net1 and net2 belong to the same net cluster, then net1
and net2 contain the same number of pins and for each pin p of net1
with coordinates $(x_p, y_p)$, there exists a pin q of net2 with coordinates
$(x_q, y_q)$ such that
1) $y_p = y_q$;
2) $|x_p - x_q| = k \cdot \text{bit pitch}$ $(1 \le k \le N - 1)$.
To denote a net cluster $NC_1$ with nets $N_0$, $N_1$, $N_2$, $N_3$, $N_4$,
we use the following notation: $NC_1 = \{N_0, N_1, N_2, N_3, N_4\}$. We
have developed different algorithms to identify net clusters. The footprint-driven clustering (FDC) algorithm creates net clusters based on
the names of pins, master cells, and nets in the datapath. This is supplemented by a more powerful instance-driven clustering (IDC) algorithm, which extracts clusters based on position information of the pins
of nets. In the detailed description of these techniques, we illustrate them via examples containing two-pin nets. However,
all our clustering techniques work for multipin nets as well.
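The net-cluster condition stated above can be checked mechanically. The following is a minimal sketch, assuming each net is given as a list of integer (x, y) pin coordinates. The names `same_cluster` and `cluster_by_position` and the data layout are hypothetical and are not taken from the paper. The sketch reads the condition as requiring one common offset k for all pins of a pair of nets, which is an assumption, and the grouping function only illustrates position-based clustering in general; it is not the IDC algorithm itself.

```python
from collections import defaultdict


def same_cluster(net1, net2, bit_pitch, num_slices):
    """Return True if net2 is net1 shifted in x by k * bit_pitch, 1 <= k <= N - 1."""
    if len(net1) != len(net2) or not net1:
        return False
    a, b = sorted(net1), sorted(net2)
    dx = b[0][0] - a[0][0]  # candidate common x offset
    k, rem = divmod(abs(dx), bit_pitch)
    if rem != 0 or not (1 <= k <= num_slices - 1):
        return False
    # Every pin must match after the shift, with identical y coordinates.
    return all((xa + dx, ya) == (xb, yb) for (xa, ya), (xb, yb) in zip(a, b))


def cluster_by_position(nets, bit_pitch):
    """Group net names whose pin patterns coincide after slice normalization."""
    groups = defaultdict(list)
    for name, pins in nets.items():
        # Origin of the bit slice that holds the leftmost pin of this net.
        base = (min(x for x, _ in pins) // bit_pitch) * bit_pitch
        key = tuple(sorted((x - base, y) for x, y in pins))
        groups[key].append(name)
    return [sorted(g) for g in groups.values() if len(g) > 1]


example_nets = {
    "N0": [(10, 5), (30, 40)],
    "N1": [(110, 5), (130, 40)],
    "N2": [(210, 5), (230, 40)],
    "X": [(50, 7), (60, 9)],
}
print(cluster_by_position(example_nets, bit_pitch=100))  # [['N0', 'N1', 'N2']]
```

Normalizing by the slice origin of the leftmost pin keeps cross-bit nets comparable, because the relative pin pattern is preserved even when a net spans several slices.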
1) Footprint-Driven Clustering: In general, datapath designers
follow a very regular naming style, in order to effectively manage and
debug the datapath design. The FDC exploits this naming regularity.
Fig. 2 shows a 4-bit datapath which follows a regular naming scheme.
We define the global footprint of a net as a string which is created
by lexicographically concatenating the names of the net pins (of the
connecting instances) and names of master-cells of those instances.
The detailed footprint of a net is defined as a string that is created
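The global footprint defined above lends itself to a short illustration. The sketch below assumes each net is described by a list of (master cell, pin name) pairs; the `master.pin` token format, the `|` separator, and the function names are illustrative choices, not the paper's. It covers only the first step, grouping nets with identical global footprints, and does not attempt the detailed footprint.

```python
from collections import defaultdict


def global_footprint(connections):
    """Concatenate 'master.pin' tokens in lexicographic order (format is assumed)."""
    return "|".join(sorted(f"{master}.{pin}" for master, pin in connections))


def group_by_footprint(nets):
    """Map each global footprint to the names of the nets that share it."""
    groups = defaultdict(list)
    for name, connections in nets.items():
        groups[global_footprint(connections)].append(name)
    return dict(groups)


nets = {
    "A[0]": [("AND2", "Y"), ("MUX2", "S")],
    "A[1]": [("MUX2", "S"), ("AND2", "Y")],
    "B": [("INV", "Y"), ("AND2", "A")],
}
print(group_by_footprint(nets))
# {'AND2.Y|MUX2.S': ['A[0]', 'A[1]'], 'AND2.A|INV.Y': ['B']}
```

Sorting the tokens before joining makes the footprint independent of the order in which a netlist happens to list the connections.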
Group 2 contains
nets SM[3], SM[2], SM[1], SM[0].
2) In the second step, we consider one group at a time to create net
clusters. At the end of this step, we get a total of three net clusters
(two from Group 1 and one from Group 2). These are $NC_1$, $NC_2$, and $NC_3$.
[Displayed net-cluster sets, partially recovered: $NC_1 = \{AB, CD, EF, GH\}$; $NC_2 = \{KL, MN, RS, TV\}$; remaining fragments: SM[1], SM[0]; ABC, DEF.]
Fig. 3. Control net-clustering.
We illustrate this algorithm by using the net RST[0] in Fig. 3. The
net is connected to pin s of instances B3, B2, B1, B0 and to two
side-pins (U and V). We can model this long net (spread over four bit
slices) as a collection of five subnets. Then, we create a net cluster out
of these five segments. To handle the special properties of control-net
clusters, the routing strategy is modified slightly, as described in Section III-F2.
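The decomposition of RST[0] into five subnets can be sketched as chaining horizontally adjacent pins. This is an illustrative reading, not the paper's implementation: the function name, the (name, x) pin representation, the coordinates, and the placement of side-pin U on the left and V on the right are all assumptions.

```python
def split_control_net(pins):
    """Chain horizontally adjacent pins of a control net into two-pin subnets."""
    ordered = sorted(pins, key=lambda pin: pin[1])  # sort by x coordinate
    return list(zip(ordered, ordered[1:]))


# Hypothetical x positions: side-pin U, pin s of B3..B0, side-pin V.
rst0 = [("U", 0), ("B3.s", 50), ("B2.s", 150), ("B1.s", 250), ("B0.s", 350), ("V", 400)]
subnets = split_control_net(rst0)
print(len(subnets))  # 5
```

With six pins the chain yields five two-pin subnets: three between adjacent bit slices and one at each end.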
The overall net cluster determination flow is described in Algorithm 2.
Algorithm 2: Net-Cluster Determination Flow
fpdClusters ← getFootprintDrivenClusters(allNets)
remainingNets ← getUnClusteredNets(allNets, fpdClusters)
idClusters ← getInstanceDrivenClusters(remainingNets)
dataClusters ← getMergedClusters(fpdClusters, idClusters)
controlNets ← getUnClusteredNets(allNets, dataClusters)
controlClusters ← getControlNetClusters(controlNets)
allClusters ← dataClusters ∪ controlClusters
return allClusters
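The flow of Algorithm 2 can be mirrored in a few lines once the three clustering stages are treated as pluggable functions. In the sketch below, `fdc`, `idc`, and `cnc` are hypothetical callables standing in for the footprint-driven, instance-driven, and control-net clustering steps; clusters are plain lists of net names, and merging is modeled as list concatenation, which is an assumption.

```python
def unclustered(all_nets, clusters):
    """Return the nets that do not appear in any of the given clusters."""
    clustered = {net for cluster in clusters for net in cluster}
    return [net for net in all_nets if net not in clustered]


def determine_net_clusters(all_nets, fdc, idc, cnc):
    """Run the three clustering stages in the order given by Algorithm 2."""
    fpd_clusters = fdc(all_nets)
    remaining = unclustered(all_nets, fpd_clusters)
    id_clusters = idc(remaining)
    data_clusters = fpd_clusters + id_clusters  # merge modeled as concatenation
    control_nets = unclustered(all_nets, data_clusters)
    control_clusters = cnc(control_nets)
    return data_clusters + control_clusters


# Toy stand-ins for the three stages, for illustration only.
result = determine_net_clusters(
    ["a0", "a1", "b0", "b1", "rst"],
    fdc=lambda nets: [["a0", "a1"]],
    idc=lambda nets: [[n for n in nets if n.startswith("b")]],
    cnc=lambda nets: [[n] for n in nets],
)
print(result)  # [['a0', 'a1'], ['b0', 'b1'], ['rst']]
```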
We first discuss the bit-slice selection strategy for the case when all
the net clusters are full and then we consider the case when some net
clusters are not full.
1) Selecting Representative Bit Slice When All Net Clusters Are
Full: To determine the representative bit slice, we conceptually
consider the datapath to have an infinite number of bit slices on the
left of the Nth bit slice and on the right of the zeroth bit slice. In this
way, each of the N bit slices has an identical number of nets. In such
a structure, any of the N bit slices can be used as the representative
bit slice for explicit routing.
When we perform route propagation, all routes to or from bit slices
to the left of the Nth bit slice or the right of the zeroth slice are disregarded, resulting in a correctly routed datapath.
2) Selecting Representative Bit Slices When Some Net Clusters Are
Not Full: If not all the net clusters are full, we need to select multiple bit slices as the representative slices. The following strategy is
used in this selection process. For simplicity, we limit our discussion
to same-bit net clusters, but the same strategy can be used for cross-bit net
clusters also.
1) Calculate the number of member nets in each net cluster and then
find the number of nets present in each bit slice.
2) Find the bit slice with the maximum number of nets and then
select that slice as the representative bit slice. In case of a tie,
we select the bit slice with the largest net cluster (other than full
net clusters). If we encounter a tie in this comparison as well,
then any one of those slices may be selected. Let us consider two
bit slices which have the same number of nets. Let us assume that
the largest net cluster (other than full net clusters) belonging to
the first bit slice has i member nets and the largest net cluster
(other than full net clusters) belonging to the second bit slice has
j member nets. If i > j, then we select the first bit slice as the
representative one. On the other hand, if i = j, then any one
of those bit slices can be selected (in our implementation, we
choose the bit slice having the higher index).
3) After obtaining the routes for nets in the selected bit slice, we
propagate these routes to other bit slices with nets in the same net
cluster (and mark the net cluster as routed). Then, we repeat steps
2 and 3 for the unrouted nonfull net clusters. When we route the
next representative slice, we do not disturb routes that have been
generated or propagated earlier. This process continues until we
mark all net clusters as routed.
We illustrate the above technique using the design shown in Fig. 4.
There are four net clusters present in that design. Those are:
1) $NC_1 = \{\ldots\}$;
2) $NC_2 = \{\ldots\}$;
3) $NC_3 = \{RF[1], RF[0]\}$;
4) $NC_4 = \{LF[3], LF[0]\}$.
Notice that bit slices 3, 2, 1, and 0 have three, two, three, and three
nets, respectively. Since bit slices 3, 1, and 0 have three nets each, we
break the tie by choosing the bit slice with the largest net cluster. We
observe that bit slices 3 and 1 have one three-member net cluster and
one two-member net cluster. On the other hand, bit slice 0 has two
two-member net clusters. Thus, either bit slice 1 or bit slice 3 can be
chosen as the first representative bit slice. In our implementation, we
choose bit slice 3. After obtaining the routes for nets in bit slice 3,
we propagate them to other bit slices with nets in the same net cluster.
Now we mark net clusters $NC_1$, $NC_2$, and $NC_4$ as routed. After
this, only one two-member net cluster is left unrouted in both bit slices
1 and 0. We select bit slice 1 as our next representative bit slice. Once
routing and route propagation are completed, we mark $NC_3$ as routed,
after which no more net clusters remain to be routed.
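The selection strategy and the walk-through above can be sketched as a small greedy loop. This is a minimal illustration under stated assumptions: each net cluster is represented only by the set of bit-slice indices in which it has a member net, the function name is hypothetical, and the example data treats $NC_1$ as a full cluster and $NC_2$ as the three-member cluster over slices 3, 2, and 1 (the text implies such clusters exist but their member lists are not available here).

```python
def select_representative_slices(clusters, num_slices):
    """Greedily pick representative bit slices until every cluster is routed.

    clusters maps a cluster name to the set of bit-slice indices it covers.
    """
    unrouted = dict(clusters)
    order = []
    while unrouted:
        def rank(s):
            sizes = [len(m) for m in unrouted.values() if s in m]
            # Tie-break 1: largest cluster in this slice that is not full.
            largest = max((n for n in sizes if n < num_slices), default=0)
            # Tie-break 2: higher slice index wins.
            return (len(sizes), largest, s)

        best = max(range(num_slices), key=rank)
        if rank(best)[0] == 0:
            break  # remaining clusters have no member in any slice
        order.append(best)
        # Routing the chosen slice and propagating marks its clusters as routed.
        unrouted = {name: m for name, m in unrouted.items() if best not in m}
    return order


fig4 = {
    "NC1": {0, 1, 2, 3},  # assumed full cluster
    "NC2": {1, 2, 3},     # assumed three-member cluster
    "NC3": {0, 1},        # RF[1], RF[0]
    "NC4": {0, 3},        # LF[3], LF[0]
}
print(select_representative_slices(fig4, num_slices=4))  # [3, 1]
```

When every cluster is full, the loop returns a single slice, in line with the observation that any slice can then serve as the representative.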
Fig. 4. Selecting representative bit slices when all net clusters are not full.
SrtOtherNets ← SortByPinYDescend(OtherNets)
if (same Y for multiple nets among SrtOtherNets) then
    SameYNets ← GetSameYNets(SrtOtherNets)
    ConflictNets ← SortByLength(SameYNets)
    SrtOtherNets ← ModifyNets(ConflictNets, SrtOtherNets)
end if
SrtAllNets ← SrtWtNets + SrtOtherNets
return SrtAllNets
If the two end points of a net are $(x_1, y_1)$ and $(x_2, y_2)$, then we define a
direct route as a path which has one of the following strap patterns.
Case 1) If $(x_1 = x_2)$ AND $(y_1 \ne y_2)$, only strap (vertical):
$(x_1, y_1) \rightarrow (x_1, y_2)$.
Case 2) If $(x_1 \ne x_2)$ AND $(y_1 = y_2)$, only strap (horizontal):
$(x_1, y_1) \rightarrow (x_2, y_1)$.
Case 3) If $(x_1 \ne x_2)$ AND $(y_1 \ne y_2)$:
a) vertical-then-horizontal (VTH): first strap (vertical)
$(x_1, y_1) \rightarrow (x_1, y_2)$, second strap (horizontal)
$(x_1, y_2) \rightarrow (x_2, y_2)$.
b) horizontal-then-vertical (HTV): first strap (horizontal) $(x_1, y_1) \rightarrow (x_2, y_1)$, second strap (vertical)
$(x_2, y_1) \rightarrow (x_2, y_2)$.
We illustrate our routing algorithm using the representative bit slice of
Fig. 5. Case 1 is illustrated between pins D and C. Case 2 is shown
between pins E and F (this case occurs only for cross-bit nets). Case 3
is shown between pins A and B. Examples of case 3 are shown in path
P5 (HTV) and path P6 (VTH).
After sorting all nets with respect to the largest Y coordinate of their
pins, we note that the topmost two pins are pins V and S. We select
the net associated with pin V as our first routing candidate, since it
would have a longer vertical strap (assuming we can find a direct route
for the net associated with pin S). We first try to find a direct route
(this minimizes the via count). We attempt both VTH and HTV direct
routes and check whether either of these routes intersects any other
pin or blockage. In this example, path P2 (VTH direct route) intersects
pin G, so we choose path P1 as the route between pins V and U.
If there were no blockage in path P2, then either of
the two paths could have been taken as the final route.
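The direct-route patterns and the blockage check described above can be sketched as follows. This is a simplified illustration, not the paper's router: pins and blockages are modeled as single grid points, the function names are hypothetical, and the coordinates in the example are invented to mimic the case in which the VTH route is blocked by a pin.

```python
def direct_routes(p1, p2):
    """Return candidate direct routes between two pins as lists of straps."""
    (x1, y1), (x2, y2) = p1, p2
    if p1 == p2:
        return []
    if x1 == x2 or y1 == y2:
        return [[(p1, p2)]]  # Case 1 or Case 2: a single strap
    vth = [((x1, y1), (x1, y2)), ((x1, y2), (x2, y2))]
    htv = [((x1, y1), (x2, y1)), ((x2, y1), (x2, y2))]
    return [vth, htv]  # Case 3: two straps, one via


def strap_hits(strap, point):
    """True if the point lies on the axis-aligned strap (end points included)."""
    (xa, ya), (xb, yb) = strap
    px, py = point
    return min(xa, xb) <= px <= max(xa, xb) and min(ya, yb) <= py <= max(ya, yb)


def first_clear_route(p1, p2, obstacles):
    """Return the first direct route that touches no obstacle, or None."""
    for route in direct_routes(p1, p2):
        if not any(strap_hits(s, o) for s in route for o in obstacles):
            return route
    return None


# Hypothetical coordinates: pin V at (0, 10), pin U at (6, 2), pin G at (0, 5).
route = first_clear_route((0, 10), (6, 2), obstacles=[(0, 5)])
print(route)  # [((0, 10), (6, 10)), ((6, 10), (6, 2))]
```

Because straps are axis-aligned, the bounding-box test in `strap_hits` is equivalent to testing whether the point lies on the segment.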
1) If (yH
<
(xH ; yH )
virtually instantiate the subnets of other bit slices into the representative bit slice. Virtual instantiation of a subnet implies treating the subnet
as a part of the representative bit slice while routing. After routing,
we reinstate the actual subnet route back to its original bit slice. This
method of modeling cross-bit nets saves runtime and memory and ensures regularity of the resulting design.
Fig. 6 shows four bit slices of a larger design with a forward cross
connectivity of degree two. Let us assume that our representative bit
slice is bit slice k. Net 1 connects pin S of instance A5 (in bit slice
k) to pin X of instance D7 (in bit slice k + 2). We assume that Net 1
belongs to a net cluster that has other member nets as well. In order to
maintain the readability of Fig. 6, only one other net (Net 2) of this net
cluster is shown.
Our aim is to route all cross-bit nets with only the data of the representative bit slice (bit slice k) loaded in memory. The core algorithm
for finding routes is the same as that for same-bit nets, with a few modifications to handle cross-bit nets. To obtain the route for Net 1, we
actually need to traverse multiple bit slices (because the degree
of cross-bit connectivity is two). Therefore, whenever we try to extend
a horizontal strap to some location in the adjacent bit, we split the strap
into two straps such that each strap is confined to a single bit slice. In
the case of Net 1, we first obtain a strap from point S (in instance A5)
to point G (in A5). Next, we attempt to create a horizontal strap from
point G (in A5) to point J (in A6). In order to illustrate the mechanism
by which we model this strap using a single representative bit slice, we
perform the following split:
$(x_G, y_G) \rightarrow (x_J, y_J) = (x_G, y_G) \rightarrow (x_H, y_H) + (x_H, y_H) \rightarrow (x_J, y_J)$
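The split above cuts a horizontal strap at the bit-slice boundary (point H) so that each piece stays within one slice. A minimal sketch of this idea follows, assuming integer coordinates, slice boundaries at multiples of the bit pitch, and slices numbered from left to right. The function names and the mapping of a piece into the representative slice by a pure x translation are illustrative assumptions.

```python
def split_strap(x1, x2, y, pitch):
    """Split a horizontal strap at every bit-slice boundary it crosses."""
    lo, hi = sorted((x1, x2))
    cuts = [lo]
    boundary = (lo // pitch + 1) * pitch
    while boundary < hi:
        cuts.append(boundary)
        boundary += pitch
    cuts.append(hi)
    pieces = [((a, y), (b, y)) for a, b in zip(cuts, cuts[1:])]
    if x1 > x2:  # keep the original direction of travel
        pieces = [(q, p) for p, q in reversed(pieces)]
    return pieces


def to_representative(piece, pitch, rep_slice):
    """Map a piece into the representative slice; also return its home slice."""
    (xa, y), (xb, _) = piece
    home = min(xa, xb) // pitch
    shift = (rep_slice - home) * pitch
    return ((xa + shift, y), (xb + shift, y)), home


pieces = split_strap(30, 130, 5, pitch=100)
print(pieces)  # [((30, 5), (100, 5)), ((100, 5), (130, 5))]
print(to_representative(pieces[1], pitch=100, rep_slice=0))  # (((0, 5), (30, 5)), 1)
```

Keeping the home slice alongside each virtual piece is what allows the piece to be reinstated in its actual bit slice after routing.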
the overlapping straps belong to the same net. After routing, the virtual
straps will be reinstated to their actual bit slices, solving the overlap
problem. Once we reach the virtual destination pin, then we simply
perform the reverse mapping of all virtual locations to their original locations to get the legal straps. We utilize the same virtual instantiation
approach for vertical straps such as $(x_J, y_J) \rightarrow (x_P, y_P)$.
If we follow the above approach, then we may get overlaps in the
horizontal straps of cross-bit nets belonging to the same net cluster. To
avoid this problem, we use the following approach: in any net cluster,
if the horizontal span between the source and destination pins is p bit slices, then
we generate p different routes for that net cluster.
Control nets are also routed using this cross-bit routing strategy.
Fig. 3 illustrates control nets that span the entire datapath block. Control-net clusters for such nets are created using the CNC technique of
Section III-D2. As a result of applying the CNC technique, $(N - 1)$
intermediate cross-bit nets (with cross-connectivity of degree one) are
created in a special net cluster. The destination pins and the source pins
of these nets have the same y coordinate and their x coordinates differ
by the bit pitch. We route one of these segments (in the representative bit slice) by using the cross-bit routing technique. Since all these
$(N - 1)$ segments belong to the same net, we do not need to generate
multiple routes to guarantee an electrically correct design.
3) Routing Multipin Nets: Multipin nets are defined as nets which
are connected to more than two pins. In typical datapath blocks, about
30%–40% of nets are multipin nets. One approach to routing a k-pin net is
to split the net into $(k - 1)$ two-pin subnets (using a minimum spanning
tree) and then route each subnet individually. The problem with this
technique is that the routes are often nonoptimal.
To address this issue, we utilize a shifted-pin approach. This
technique can be used for both same-bit and cross-bit nets. Consider a k-pin net, which is connected to pins present at $(x_1, y_1)$,
$(x_2, y_2)$, $(x_3, y_3)$, ..., $(x_k, y_k)$, where $y_1 \ge y_2 \ge y_3 \ge \cdots \ge y_k$.
In the first step, we split this net into $(k - 1)$ two-pin subnets
$S_1, S_2, \ldots, S_{k-1}$, where $S_i$ is connected to pins located at $(x_i, y_i)$
and $(x_{i+1}, y_{i+1})$. Since our strap-based router starts routing from the
topmost row, the subnet $S_1$ is routed first. After routing this subnet,
the router considers $S_2$, which is the subnet between the pins present
at $(x_2, y_2)$ and $(x_3, y_3)$. Instead of treating the pin located at $(x_2, y_2)$
as a fixed pin, we virtually shift it to $(x_3', y_3')$, such that:
1) $(x_3', y_3')$ is a point on the route for $S_1$;
2) the Manhattan distance between $(x_3', y_3')$ and $(x_3, y_3)$ is the minimum over all the points on the route for $S_1$.
After obtaining the new pin location $(x_3', y_3')$, we perform strap-based routing between this new pin and the pin at $(x_3, y_3)$. This strategy
is applied for all the $(k - 1)$ subnets. This technique is quite useful
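The shifted-pin step amounts to finding the point on an already routed subnet that is closest, in Manhattan distance, to the next pin. The sketch below assumes a route is a list of axis-aligned straps given as pairs of (x, y) end points; the function names and example coordinates are hypothetical.

```python
def closest_point_on_strap(strap, pin):
    """Closest point to pin on an axis-aligned strap, by coordinate clamping."""
    (xa, ya), (xb, yb) = strap
    px, py = pin
    cx = min(max(px, min(xa, xb)), max(xa, xb))
    cy = min(max(py, min(ya, yb)), max(ya, yb))
    return (cx, cy)


def manhattan(a, b):
    """Manhattan distance between two points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def shifted_pin(route, next_pin):
    """Point on the routed subnet with minimum Manhattan distance to next_pin."""
    candidates = [closest_point_on_strap(s, next_pin) for s in route]
    return min(candidates, key=lambda c: manhattan(c, next_pin))


# Route of S1 from (0, 10) to (4, 6), horizontal then vertical.
s1_route = [((0, 10), (4, 10)), ((4, 10), (4, 6))]
print(shifted_pin(s1_route, (7, 8)))  # (4, 8)
```

In this example the shifted pin (4, 8) lies three units from the next pin, whereas the original pin at (4, 6) lies five units away, so the second subnet needs a shorter strap.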
TABLE I
CHARACTERISTICS OF EXAMPLE CIRCUITS
trol-net clusters. The figure assumes that the horizontal route between
the s pins of net RST [0] of adjacent bit slices is blocked.
At this stage, we check whether there are any unrouted nets present in
the design. If some nets have not been routed, we invoke the strap-based
routing scheme to route these nets.
IV. ADVANTAGES OF OUR APPROACH
Fig. 9.
MasterRoutes ← getMasterRoutes(MasterNet, p)
/* MasterRoutes is an array of p routes */
MasterRoute
MasterRoutes
NetCluster ← getNetClusterForNet(MasterNet)
OtherSisterNets ← getSisterNets(NetCluster, MasterNet)
for each net (SisterNet) in OtherSisterNets do
    NewSourceBit ← getSourceBit(SisterNet)
    PositiveBitDiff ← NewSourceBit − k + N
    ModValue ← (PositiveBitDiff modulus p)
    SisterRoute ← ModifyRoute(MasterRoutes[ModValue], SisterNet, MasterNet)
    AssignRoute(SisterNet, SisterRoute)
end for
end for
Once we complete route propagation for all the nets present in the
representative bit slice, we obtain a design-rule correct route for all the
bit slices and the routing task is completed.
Route propagation changes slightly for cross-bit connections. As
mentioned in the previous section, if the horizontal span between the
source and destination pins of a cross-bit net is p bit slices, then we
construct p different routes for that net. Now, we need to propagate the
correct route to individual bit slices. Algorithm 5 describes our technique for route propagation of cross-bit nets.
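The choice among the p master routes can be illustrated with a short sketch. It assumes that the route index for a sister net is (source bit − k + N) mod p, where k is the representative bit slice and N the number of bit slices; the "+ N" term is a reading of the algorithm that keeps the difference nonnegative and should be treated as an assumption. The sketch further assumes that modifying a route is a pure x translation by a multiple of the bit pitch and that x grows with the bit index. All function names and coordinates are hypothetical.

```python
def translate(route, dx):
    """Shift every strap of a route horizontally by dx."""
    return [((xa + dx, ya), (xb + dx, yb)) for (xa, ya), (xb, yb) in route]


def propagate_cross_bit(master_routes, k, sister_bits, num_slices, pitch):
    """Assign one of the p master routes to each sister net and translate it."""
    p = len(master_routes)
    assigned = {}
    for bit in sister_bits:
        positive_diff = bit - k + num_slices  # nonnegative bit difference
        mod_value = positive_diff % p         # which of the p routes to reuse
        assigned[bit] = translate(master_routes[mod_value], (bit - k) * pitch)
    return assigned


# Two master routes on different tracks (y = 1 and y = 2), span p = 2.
masters = [[((0, 1), (150, 1))], [((0, 2), (150, 2))]]
routes = propagate_cross_bit(masters, k=1, sister_bits=[0, 2, 3], num_slices=4, pitch=100)
print(routes[2])  # [((100, 2), (250, 2))]
print(routes[3])  # [((200, 1), (350, 1))]
```

In the example, the nets sourced in adjacent bit slices 2 and 3 land on different tracks, so their overlapping horizontal extents do not short.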
For control nets, we propagate the routes obtained in the representative bit slice by using the cross-bit route propagation technique. Also,
since the rightmost and the leftmost segments in a control-net cluster
are topologically different, we perform explicit routing for these two
segments. In Fig. 9, we have shown the routing results for the two con-
TABLE II
RUNTIME, WIRE-LENGTH, AND VIA-COUNT COMPARISON BETWEEN AN INDUSTRIAL ROUTER AND OUR ROUTER
about 8%. We conjecture that our router utilizes fewer vias because
of its strap-based nature.
VI. CONCLUSION
In this paper, we have presented a new method for performing
detailed routing for datapath designs, which fully utilizes the regular
structures present in a datapath. In our technique, we first extract
interconnection regularity within the datapath by creating net clusters. Next, we route the nets in a single representative bit slice of the
datapath and, from the routes thus obtained, infer routes for the
rest of the nets in the corresponding net clusters. Experimental results
demonstrate a significant improvement in runtime over a commercial
router. Also, our router produces highly predictable timing results.
REFERENCES
[1] N. Buddi, M. Chrzanowska-Jeske, and C. Saxe, "Layout synthesis for
datapath designs," in Proc. Eur. Design Automation Conf., Sept. 1995,
pp. 86–90.