2012 - 3D IC Floorplanning Automating Optimization Settings and Exploring New Thermal-Aware Management Techniques
Microelectronics Journal
journal homepage: www.elsevier.com/locate/mejo
Article info

Article history:
Received 2 August 2011
Received in revised form 3 March 2012
Accepted 6 March 2012
Available online 30 March 2012

Keywords: 3D–IC; Floorplanning; Multi-objective optimization; Thermal; TSV

Abstract

The introduction of 3D chip architectures is an increasingly attractive integration solution due to the potential performance improvement, power consumption reduction and heterogeneous integration. Nevertheless, thermal distribution, evacuation and limitation constitute some of the key issues that can hinder widespread adoption of 3D integration technology. Efficient 3D floorplan algorithms have to be developed to address such complexity. In this paper we first discuss the implementation of such an algorithm and identify parameters that play a role in the solution quality. We then propose the use of a genetic algorithm to discover sets of parameters that guarantee good floorplan quality. Then, we present an improved thermal-aware floorplanner based on a new formulation of the cost function that minimizes not only peak temperature, but also thermal gradients. The temperature minimization goal is reinforced using a smart heuristic that guides 3D moves in the direction of placing power hungry blocks next to the heat sink. Experimental results show the ability of the method to reduce the temperature peak and gradient significantly, while maintaining area, wirelength and computation time.
© 2012 Elsevier Ltd. All rights reserved.
* Corresponding author.
E-mail addresses: [email protected] (F. Frantz), [email protected], [email protected] (L. Labrak), ian.O’[email protected] (I. O’Connor).
0026-2692/$ - see front matter © 2012 Elsevier Ltd. All rights reserved.
http://dx.doi.org/10.1016/j.mejo.2012.03.005
F. Frantz et al. / Microelectronics Journal 43 (2012) 423–432

1. Introduction and related work

Three-dimensional integration, where multiple device layers are vertically stacked and interconnected, is perceived as a solution to scale the performance of electronic devices beyond Moore’s law. It provides means to drastically decrease interconnect length, which directly results in increased speed [1,2], and to combine various technologies (digital, analog, memory, etc.) [3] and physical domains [4] in a single product, thereby greatly extending the capabilities of systems-on-chip (SoC). Other interesting characteristics such as low power consumption and high performance are expected from 3D integration, and make this solution a good candidate for a wide range of applications (medical, automotive, communication, wireless, etc.) [5].

However, these benefits come at a price: testing becomes difficult [6], yield can decrease rapidly with the number of layers in the stack [7], while the design space grows exponentially with the number of layers. Another major concern is the heat evacuation problem [8]. Indeed, when stacking two 100 W/cm² microprocessors, the net power density becomes 200 W/cm², which is beyond the heat removal capacity of currently available air-cooled heat sinks. Therefore, inter-layer micro channel cooling [9] is proposed as an alternative for these high-performance applications.

In a 3D–IC design flow, we identify three main approaches to reduce on-chip temperature: (i) thermal-aware floorplanning, (ii) thermal via insertion and (iii) package and heat sink design. While both (ii) and (iii) have been proved to help heat dissipation [10–13], they require additional silicon area and external components (fan, pump, etc.), respectively. Thus, to be cost effective, the use of these latter solutions should be limited by tackling the heat problem during floorplanning.

Thermal-aware floorplanners have already been presented by multiple authors, all of them minimizing a weighted sum of area, wirelength and peak temperature. Zhou et al. [14] use an analytical method, where temperature is a force repelling blocks from clustering. Cong et al. [15] use the simulated annealing heuristic (SA) to solve the minimization problem and introduce a fast method to compute temperature. In [16], the authors also use SA, but suggest a two-phase algorithm where temperature is only minimized in the second stage.

[Fig. 1 diagram. Panel labels: "Parameters meta-optimization with GA" (move selection, cost function weights, cooling schedule; inputs: reference benchmarks; output: Pareto set of parameters) and "Floorplan algorithm" (modify floorplan with % 2D moves and % 3D moves; evaluate C(S) = α*Area + β*WL + ...; accept/reject with rand < e^(-ΔC/T)). Inputs: design description, heat sink/package. Output: stacked system.]

In this work, we propose to address several issues to facilitate the implementation and improve the performance of thermal aware floorplanners. First, we consider the process of tuning the floorplan algorithm. Indeed, even if some floorplanners use analytical methods, an overwhelming number rely on the simulated annealing (SA) heuristic to determine the best floorplan. This heuristic (as with all optimization algorithms) needs to be tuned to the specific problem to allow a fast and efficient
convergence to a global minimum. This tedious task is generally not discussed, and most attention is devoted to the other crucial aspect, the cost function. We propose techniques to improve the floorplan quality on both aspects in the following ways:

• We use a multi-objective optimization based on Genetic Algorithms (GA) to find the tuning parameters for a 3D–IC floorplanner. This meta-optimization is performed offline (Fig. 1) on a limited set of benchmarks and generates the default parameters to be used on all other problems.
• We propose a new cost function formulation that allows Through Silicon Via (TSV) dimensions to be taken into account. Indeed most other approaches consider the TSV count in the floorplan without including its impact on the total area. Here, we include the area and height of the 3D-via directly into the footprint and wirelength computation, providing means to better evaluate the tradeoffs of using such interconnection.

In this section we discuss the building blocks of a floorplanning algorithm based on SA and identify, for each of them, design parameters that can affect the convergence of the optimization and the quality of the floorplan. First we recall the general formulation of a 3D floorplan problem. Then we discuss the writing of the cost function, the choice for a floorplan representation, the main design choices for the SA heuristic and finally how to perturb the solution. This description will allow us to define the tuning parameters that must be optimized to build an efficient floorplanner using our meta-optimization approach.

2.1. Problem formulation

Let $B = \{b_1, b_2, \ldots, b_m\}$ be a set of rectangular blocks with height $h_i$ and width $w_i$, and let $T = \{t_1, t_2, \ldots, t_p\}$ be the set of terminals. Each block has a set of pins $P_i$ that connects it to the pins of other blocks and the terminals, forming nets. Let $L = \{l_i \mid 1 \le i \le n\}$ be the set of n layers. Let $(x_j, y_j, l_j)$ denote the coordinates of terminal $t_j$ and $(x_i, y_i, l_i)$ denote the coordinates of block $b_i$. The 3D floorplanning problem is to find a solution S for the assignment of blocks coordinates $(x_i, y_i, l_i)$ so that no two blocks overlap and a cost function C(S) is minimized.

2.2. Cost function

$$\mathrm{Footprint}_{Norm} = \frac{\mathrm{Footprint}_{New}}{\mathrm{Area}_{Blocks}/n} \qquad (2)$$

$$\mathrm{SumArea}_{Norm} = \frac{\mathrm{SumArea}_{New}}{\mathrm{Area}_{Blocks}} \qquad (3)$$

$$WL_{Norm} = \frac{WL_{New}}{WL_{Old}} \qquad (4)$$

SumArea and Footprint goals are normalized by absolute values. WL has to be normalized by a relative value since we do
not know the minimal WL that can be achieved and at what area expense.

Some authors have also proposed the minimization of the number of TSVs by adding a factor ξ*nbvias to the cost function [14,15]. These approaches only minimize the via count without considering the TSV dimensions. It is however clear that the relative size of the TSVs is the factor that will limit their number: for a given design, one can insert a larger number of high density TSVs than medium density TSVs. Therefore, in contrast with the previous approaches, we propose to account for the effects of these interconnects directly in the three objectives of (1). The height of the TSVs is added to the total wirelength, and blocks are surrounded by a guard ring to account for the TSV area overhead.

When formulating a cost function using a weighted sum, the most important and difficult task is to determine the appropriate weights that will guide the optimization process in order to find a solution reflecting the expected tradeoff. The relative importance of the objectives is not generally known until the system’s best capabilities are determined and tradeoffs between the objectives are fully understood. As the number of objectives increases, tradeoffs are likely to become complex and less easily quantified. The designer must therefore rely on intuition and ability to express preferences throughout the optimization cycle. The meta-optimization approach removes the uncertainty introduced by this human intervention, by considering the weights of the cost function as optimization variables.

2.3. 3D–SP representation

The sequence pair (SP) is a general floorplan representation proposed by [17]. It consists of two permutation lists of n elements, where n is the number of blocks. Topological information is encoded by the order in which the blocks appear in the two lists, for example:

(⟨…, bi, …, bj, …⟩, ⟨…, bi, …, bj, …⟩) → bi is left of bj
(⟨…, bi, …, bj, …⟩, ⟨…, bj, …, bi, …⟩) → bi is above bj

To represent 3D floorplans, one SP is used for each layer. The blocks in a layer’s SP are placed on that layer using the SP packing method to compute their (x, y) coordinates.

In [18] the authors compare SP to Transitive Closure Graph (TCG) and B*-Tree, which are other popular representations. SP and TCG capture the same set of floorplans and have a redundant design space in (n!)². Their original packing method takes O(n²), but SP is easier to implement. B*-Tree can be evaluated in amortized O(1), but captures only compacted floorplans (i.e., a floorplan where no block can move without overlap or change in the outline): a property that can exclude some (or all) of the interconnect-optimal packings. Also, the redundancy in the SP/TCG design space, often seen as a limitation, can in fact make local search more successful by increasing the number of paths to the global optima.

temperature update function is written as:

$$T_{k+1} = \frac{T_k}{1 + T_k \log(1+\lambda)/(3\sigma_k)} \qquad (5)$$

where $\sigma_k$ is the standard deviation of the cost values observed during the kth loop of the algorithm, and λ is a tuning parameter. The number of iterations at each temperature floor (where the temperature is held constant during a given algorithm loop) is directly related to the cooling schedule, and must be explicitly defined, as we discuss next.

Moves per temperature floor (innerIter): Using a large number of moves at the same temperature floor does not help convergence, while using a low number can present a noisy $\sigma_k$ and affect the cooling schedule. In our implementation, innerIter is linked to the number of blocks (nBlocks) by the parameter k:

$$\mathrm{innerIter} = k \cdot \mathrm{nBlocks} \qquad (6)$$

Initial temperature: a suitable initial temperature $T_0$ is one that results in a high probability $w_0$ of accepting solutions that increase C(S), allowing a large exploration of the neighborhood of $S_0$ in the early iterations. Some authors suggest this probability to be around 0.8 [19]. It is clear that $T_0$ will depend on the scale of C(S) and, hence, be tied to the magnitude of the weights in the cost function and to the circuit under floorplan. To eliminate this dependency, it is possible to estimate $T_0$ in a first approach. We do this by accepting all solutions that increase C and calculating the average cost increase $\Delta C^{+}_{Avg}$. Since a cost increase ΔC is accepted with probability $e^{-\Delta C/T}$, $T_0$ is given by:

$$T_0 = -\frac{\Delta C^{+}_{Avg}}{\ln(w_0)} \qquad (7)$$

In our implementation, $\Delta C^{+}_{Avg}$ is measured during the first innerIter moves.

In the described implementation, only two parameters control the annealing schedule: the λ parameter in Aarts’ schedule and the k multiplier that defines the number of moves performed at each temperature floor.

2.5. Solution perturbation

The core of a floorplanning algorithm is the manner in which it modifies the solution and improves it over time. The simplest and most robust way to perturb the solution is to randomly permute the blocks’ positions. However this can be time consuming. Therefore efficient algorithms make use of more sophisticated moves to guide the search.

In [20], the authors present a series of heuristic moves that can be applied to 2D SP to achieve good results for area and wirelength minimization with limited runtime. Among the important contributions are (i) WL minimization moves and (ii) the notion of slack in a floorplan:

A WL minimization move (Fig. 2(a)) corresponds to moving a block bi close to the ‘‘ideal’’ location that would minimize the
wirelength of its incident nets. This ‘‘ideal’’ location (xa, ya) is simply the average of the position of all modules connected to bi weighted by the net degree (number of connected pins). Once a block bj sitting close to (xa, ya) is identified, bi is moved next to bj in the SP.

The notion of slack in a floorplan is analogous to that in Static Timing Analysis (STA). It is the distance that a block can be moved in a given direction without changing the floorplan outline. As in STA, there is also the notion of a critical path, which constrains the floorplan dimensions (Fig. 2(b)). A slack move then consists of removing a block in the critical path and inserting it in a position with large slack.

Our implementation is based on Parquet-4.5 [20]. We have extended its original SP representation to 3D and defined new move actions using the notions above. We can perturb the solution in the following ways:

3D move action set:
1 Swap blocks from two layers
2 Move block to another layer
3 Slack move between two layers
4 WL minimization move between two layers.

2D move action set:
5 Random permutation of SP
6 Change orientation
7 Same as (3), but on a single layer
8 Same as (4), but on a single layer
9 Same as (7), but with orientation change to better fill the slack.

The difficulty here is how to define with what probability each move action should be applied. Moreover, it is known that 3D moves cause the greatest perturbations in the floorplan and, when using the SA heuristic, they are increasingly unlikely to be accepted as the algorithm progresses [15]. Therefore we reduce the probability of calling the set of 3D move actions with a Piecewise Linear (PWL) function of the maximum number of moves Itermax (a stop criterion of SA), as depicted in Fig. 3.

[Fig. 3 plot: probability versus iter, with breakpoints itersat and itermax.]
Fig. 3. Probability of trying a 3D move action as a function of the number of iterations.

Finding a good distribution of the move actions in the 2D and 3D sets (five and four variables, respectively) and adjusting the shape of the PWL function (three variables) plays a major role in the efficiency and robustness of the optimization, and is consequently part of our meta-optimization problem.

3. Meta-optimization based floorplanner tuning

The previous section identified the tuning parameters of each constituent block of our SA floorplanner algorithm. The task of assigning values to these parameters has been formulated as a second optimization problem, to which we have applied a genetic algorithm (GA). This second problem is solved prior to the release of the algorithm to the end user. It searches a set of parameters that are able to minimize the footprint and wirelength of a reduced, but representative, set of floorplan problems (training problems). This method gives a means to automate the floorplanner algorithm tuning, thus avoiding a tedious and knowledge based process. In this section we recall the features of GA and then detail how the optimization problem was set up.

3.1. Multi-objective GA and Pareto optimality

GA is a stochastic global search method that mimics natural evolution. It acts over a population of potential solutions, applying intensification (crossover) and diversification (mutation) operators to explore the problem space. The fittest individuals are selected and are used to generate a new population, in the hope of improving the solution quality.

Since GA manages a population, it accepts the cost function to be a vector of objectives, instead of a weighted sum. The GA will then return a set of Pareto-optimal solutions (Pareto front), instead of a single solution. A solution S is Pareto-optimal if there exists no other S′ that is superior to S in terms of all objectives. As opposed to the weighted sum formulation, the vectorized form does not require weighting or normalizing of the objectives. Therefore, no prior knowledge about the problem is required.

3.2. Problem setup

The 3D floorplan problem is an extension of the 2D problem. A floorplanner must already demonstrate good performance in 2D cases to ensure good results with 3D cases. Therefore, we have divided the meta-optimization into two phases. The first phase of the optimization only takes into account 2D related parameters (Table 1) and executes the training problems for the planar case. The optimization continues in the second phase running the training problems for a 3D case and acting on the remaining variables. In this approach, the first step guarantees a good usage of the available intra-layer moves (2D), while the second one adapts the frequency and the manner in which blocks are exchanged between layers. The number of variables in each phase is 10 and 7, respectively.

Table 1
Meta-optimization variables.

The proposed meta-optimization approach provides a good way to automate the simulated annealing algorithm setting, involved in almost all the floorplanners. Moreover, splitting the meta-optimization process into multiple steps allows the number of optimization variables to be limited (to keep the problem tractable) and enables the incremental addition of more features: 2D and 3D floorplanning and, as will be discussed in the next sections, thermal-aware features.

4. Temperature minimization background

In this section, we recall the main concepts involved in a thermal aware floorplanner and highlight some practical issues.
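As a companion to the Pareto-optimality notion recalled in Section 3.1, the following minimal sketch filters a list of candidate parameter sets down to its non-dominated subset. It is only an illustration: the function names and the (footprint, wirelength) values are hypothetical, and it uses the common dominance convention (no worse in every objective and strictly better in at least one), which is an assumption slightly stricter than the wording used in the text.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every objective and strictly
    better in at least one (all objectives are minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep the points that no other point dominates, in input order."""
    return [
        p for p in points
        if not any(dominates(q, p) for q in points if q is not p)
    ]


# Hypothetical normalized (footprint, wirelength) results of five
# candidate parameter sets; these are not values from the paper.
candidates = [(1.00, 1.10), (1.05, 1.00), (1.10, 1.20), (0.95, 1.15), (1.05, 1.05)]
front = pareto_front(candidates)
print(front)
```

With these made-up values, (1.10, 1.20) and (1.05, 1.05) are discarded because another candidate is at least as good in both objectives; the three remaining points represent different tradeoffs, from which a designer would still have to pick one.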
4.2. The choice of a thermal model

During the floorplanning process, millions of iterations are needed until the search converges to thermally efficient solutions. With such a high number of solutions to evaluate, it is not feasible to run a detailed finite element simulation to evaluate the thermal profile of each one. Several authors have thus proposed the use of simplified thermal models that are suitable for use in a floorplanning algorithm. We have identified two models that represent different degrees of accuracy. Both rely on the thermal–electrical analogy and use cubes to mesh the chip volume. The temperature values are obtained solving the linear system $T = P \cdot R_{th}$, where $R_{th}$ is the thermal resistivity matrix and P is the power vector.

Among the identified models, HotSpot [21] is the most refined. It solves the linear system with an iterative multi-grid method, starting with a coarse mesh and then successively refining the solution. HotSpot also models the heat flow from the chip to the heat sink and the board.

A faster alternative was used in [15]. Instead of solving the complete linear system, the lateral heat flow is neglected and tile stacks are analyzed individually (Fig. 4). Because there is no interaction between tile stacks, this approach is less accurate and can produce a noisy thermal profile. Nevertheless, in [16] it is shown that the correlation between this model and HotSpot is 0.82, making it a reasonable choice for floorplanning. In our implementation, we use both approaches, neglecting lateral heat flow during the optimization and using HotSpot with a fine mesh to evaluate the final solution.

4.3. The impact of the thermal profile on the search algorithm

Floorplanning algorithms are usually initialized randomly. Random initializations generally produce disorganized (sparse) floorplans. This largely favors the Tmax objective and can prevent the search algorithm from moving to solutions of smaller area and wirelength.

Moreover, the thermal conductivity of the bonding interface material (epoxy, 0.05 W/mK) is much lower than that of silicon (150 W/mK) and copper (285 W/mK). This large difference creates a barrier to heat flow, leading to significant temperature increases at each bonding interface. Consequently, it disturbs the search and impedes the efficient use of the upper layers. As an example, in [14], [15], where the formulation (8) has been used, the temperature reduction comes at the expense of area increase, of the order of 16%. An alternative to this problem is proposed in [16], using a two-phase algorithm. As described in the next section, this approach has been adapted in our implementation with different characteristics to improve its efficiency.

5. Proposed methodology for thermal floorplanning

In Section 4 we have shown that using temperature measurements on sparse floorplans can disturb the search. Thus, explicit temperature minimization should be left to a stage where the area is sufficiently compact. In this section, we present a two-phase algorithm, where temperature is minimized implicitly during the first phase and then explicitly during the second. We also define the criteria based on area to switch from phase one to two and a heuristic that will help the search algorithm converge to cooler solutions.

Our two-phase algorithm differs from that in [16] in the following aspects: first, we switch from phase one to phase two without restarting the annealing schedule. Second, we are able to reduce temperature in the first phase. Third, our cost function for phase two keeps all the objectives of phase one, in contrast with [16] in which the wirelength objective is not considered when optimizing temperature.

5.1. Phase 1 – a thermally efficient power density distribution

From a 1D approximation of the vertical heat flow on the chip (Fig. 5(a)) it can be noticed that a power distribution with a pyramidal shape (Fig. 5(b)) will implicitly reduce peak temperature.

Therefore, during the first phase, we seek to arrange the blocks in n layers so that the power density is maximized and the more power-hungry blocks are placed closer to the heat sink. For this purpose the cost function is written as:

$$C(S) = \alpha \cdot \mathrm{Area} + \beta \cdot WL + \delta \cdot (1/P_{Dens}) \qquad (9)$$

where $P_{Dens}$ is a weighted sum of the power density of each layer:

$$P_{Dens} = \sum_{i=1}^{n} \frac{P_i}{\mathrm{Area}_i} \cdot q_i \qquad (10)$$

with

$$q_i = R_1 \Big/ \sum_{j=1}^{i} R_j \qquad (11)$$

and the weighting factors $q_i$ decrease for higher layers (away from the heat sink). In such a formulation, the term $P_{Dens}$ is maximized both when the area is reduced and when the power-hungry blocks are moved to the lower layers. Hence, it provides a means

¹ For the sake of clarity, the terms footprint and sumArea are noted as a single term Area.
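The weighting of Eqs. (10) and (11) can be tried out numerically with the short sketch below. It is an illustration under stated assumptions, not the authors' code: R_j is taken here to be a per-layer vertical thermal resistance with layer 1 closest to the heat sink (the text above does not define it), and the function names and all numerical values are hypothetical.

```python
from typing import List, Sequence


def layer_weights(r: Sequence[float]) -> List[float]:
    """q_i = R_1 / sum_{j=1..i} R_j, as in Eq. (11).
    r[0] belongs to layer 1, assumed closest to the heat sink."""
    weights = []
    running_sum = 0.0
    for r_j in r:
        running_sum += r_j
        weights.append(r[0] / running_sum)
    return weights


def power_density_term(power: Sequence[float],
                       area: Sequence[float],
                       r: Sequence[float]) -> float:
    """P_Dens = sum_i (P_i / Area_i) * q_i, as in Eq. (10)."""
    q = layer_weights(r)
    return sum(p / a * q_i for p, a, q_i in zip(power, area, q))


# Hypothetical four-layer stack with equal resistances and unit areas.
r = [1.0, 1.0, 1.0, 1.0]
area = [1.0, 1.0, 1.0, 1.0]

# Same total power, distributed towards or away from the heat sink.
hot_near_sink = power_density_term([4.0, 3.0, 2.0, 1.0], area, r)
hot_far_from_sink = power_density_term([1.0, 2.0, 3.0, 4.0], area, r)
print(layer_weights(r), hot_near_sink, hot_far_from_sink)
```

Under these assumptions the weights fall from 1 on the first layer to 0.25 on the fourth, so the distribution that places the power-hungry layers next to the heat sink yields the larger P_Dens and therefore the smaller δ*(1/P_Dens) contribution in Eq. (9).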
[Fig. 8 plot with axes N100 Wirelength, AMI49 Wirelength, N100 Area and AMI49 Area.]
Fig. 8. Four-Dimensional Pareto front obtained with GA. Values are normalized by the average of each objective. The dashed rectangle indicates the selected tradeoff.
[Fig. 9 panels: cost function weights (α, β, γ); distribution of 3D moves; probability of a 3D move (probability (%) versus iterations (% of maxIter)); distribution of 2D moves.]
Fig. 9. Visual representation of an optimized parameter set.

Table 2
Performance comparison for 2D test cases.

Circuit   No. Iter.   Parquet-4.5 (SP)                 This work
          (×10³)      Area (WS) (mm²)    Wire (µm)     Area (WS) (mm²)    Wire (µm)
ami33     66          1.32 (14.1%)       59,983        1.31 (13.4%)       53,802
ami49     98          40.6 (14.5%)       800,884       40.89 (15.4%)      689,881
n100      200         0.197 (9.75%)      234,346       0.196 (9.14%)      233,666
n200      400         0.195 (11.0%)      441,127       0.193 (9.46%)      442,237
n300      600         0.304 (11.3%)      623,822       0.300 (9.94%)      621,030

compares its 2D performance to Parquet for an equal number of iterations.² Our implementation achieves similar wirelength and a slight reduction in white space (WS). The proposed meta-optimization proved to be successful in parameterizing the floorplanner. The performance in a 3D scenario is compared to [14] and [15] for a four layer stack. To enable comparison with these works, the objective of minimizing the number of vias is introduced in our cost function and its weighting factor is manually set to obtain similar via count. Table 3 shows that our implementation significantly reduces the white space compared to both CBA and 3D–STAF. The wirelength is, on average, 5% longer than 3D–STAF, but still 8% smaller than CBA. We underline however that, for a four layer stack, any solution with white space larger than 30% presents little practical interest, since the same area could be achieved with only three layers. Therefore, the ability of our floorplanner to find the most compact solutions, regardless of the number of blocks in the benchmark, demonstrates the power of this approach.

6.1.3. Accounting for TSV dimensions

In this section we show the improvement that can be achieved if TSV dimensions are handled directly in the cost function. Here we consider the two following approaches:

• A posteriori insertion: The floorplan is obtained minimizing the TSV count (as in Table 3). The solution is then modified to reserve area for the TSVs and their height is added to the total wirelength. Fig. 10(a) and (b) illustrate this approach.
• A priori insertion: The cost function includes the TSV contribution to area and wirelength during the entire optimization process. Fig. 10(c) shows that this approach can make better usage of the available area across the layers.

To compare these approaches, a stack with all four layers in face-to-back (F2B) orientation and terminals on top is considered. Dimensions for TSVs are taken from ITRS projections [23] and two cases are considered: for smaller benchmarks (n100, n200, n300) TSVs have a pitch of 4 × 4 µm² and a height of 10 µm, dimensions compatible with W2W bonding. For ami33 and ami49, larger TSVs compatible with D2D and D2W are used: pitch 8 × 8 µm² and height 20 µm.

Results are presented in Table 4.³ For each metric the first column is for a posteriori via insertion and the second for the proposed approach. The wirelength metric includes the TSV

² Parquet is run using Sequence Pair and default weights. Terminals are placed at the boundary of the chip as defined in the benchmarks available at http://vlsicad.eecs.umich.edu/BK. Results for Parquet are also averaged over 100 runs.
³ The first column of each metric corresponds to adding the TSV overhead to the results from Table 3, while the second corresponds to the proposed approach.
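To give a feeling for the magnitudes involved in the a posteriori approach of Section 6.1.3, the sketch below estimates the wirelength and area added by a given number of TSVs from their pitch and height. This is a back-of-the-envelope model and an assumption on our side: it simply charges one TSV height of wire and one pitch-by-pitch square of area per via, and it does not reproduce the guard-ring accounting used in the floorplanner. The via count of 1000 and the function name are hypothetical; only the two TSV geometries come from the text.

```python
from typing import Tuple


def a_posteriori_tsv_overhead(via_count: int,
                              pitch_um: float,
                              height_um: float) -> Tuple[float, float]:
    """Return (extra wirelength in µm, reserved area in µm²) for
    via_count TSVs, charging one height and one pitch² per via."""
    extra_wirelength_um = via_count * height_um
    reserved_area_um2 = via_count * pitch_um * pitch_um
    return extra_wirelength_um, reserved_area_um2


# TSV geometries quoted in the text; the via count is made up.
small_tsv = a_posteriori_tsv_overhead(1000, pitch_um=4.0, height_um=10.0)
large_tsv = a_posteriori_tsv_overhead(1000, pitch_um=8.0, height_um=20.0)
print(small_tsv)  # (10000.0, 16000.0)
print(large_tsv)  # (20000.0, 64000.0)
```

Even in this crude model, doubling the pitch quadruples the reserved area for the same via count, which illustrates why the via dimensions, and not only the via count, matter for the footprint.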
Table 3
Footprint, wirelength and 3D-vias minimization for four layer stack.
Area (WS) Wire Vias Time Area (WS) Wire Vias Time Area (WS) Wire Vias Time
(mm2) (mm) (count) (s) (mm2) (mm) (count) (s) (mm2) (mm) (count) (s)
ami33 0.364 (25.9%) 26.1 116 2 0.379 (31.1%) 22 122 52 0.353 (22.1%) 22.5 93 23
ami49 11.48 (29.6%) 427.2 194 5 13.49 (52.2%) 437.5 227 57 14.90 (68.2%) 446.8 179 86
n100 0.050 (11.5%) 92.7 884 20 0.059 (31.5%) 91.3 828 68 0.053 (17.9%) 100.5 955 313
n200 0.048 (10.1%) 173.9 1810 84 0.059 (34.3%) 168.6 1729 397 0.058 (31.4%) 210.3 2093 1994
n300 0.075 (10.3%) 257.2 1914 160 0.097 (42.0%) 237.9 1554 392 0.089 (30.3%) 315 2326 3480
To allow comparison, terminals are placed at the center of the top-most layer. Results for this work are averaged over 100 independent runs. The data for CBA and 3D-STAF
is presented as in the original publications for the area–wirelength minimization problem. The runtime data is not directly comparable.
Fig. 10. Three-layer floorplan results for n300 with various strategies: (a) minimizing via count and then (b) reserving area for vias, or (c) accounting for via dimensions during the whole optimization process. (a) Footprint = 65,475 µm², WL = 195,782 µm, vias = 1218. (b) Footprint = 76,012 µm², WL = 219,907 µm, vias = 1218. (c) Footprint = 73,647 µm², WL = 215,570 µm, vias = 1244. Here solution (c) uses more vias but has about 3% less area and wirelength than (b).
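The comparison between solutions (b) and (c) in the Fig. 10 caption can be checked with a few lines of arithmetic. The sketch below only recomputes relative differences from the footprint, wirelength and via values quoted in the caption; the helper name is ours.

```python
def relative_change(new: float, ref: float) -> float:
    """Relative change of new with respect to ref, as a fraction."""
    return (new - ref) / ref


# Values quoted in the Fig. 10 caption for solutions (b) and (c).
solution_b = {"footprint": 76012, "wirelength": 219907, "vias": 1218}
solution_c = {"footprint": 73647, "wirelength": 215570, "vias": 1244}

for metric in solution_b:
    change = 100.0 * relative_change(solution_c[metric], solution_b[metric])
    print(f"{metric}: {change:+.1f}%")
# footprint: -3.1%
# wirelength: -2.0%
# vias: +2.1%
```

With the quoted numbers, the footprint of (c) is about 3.1% smaller and its wirelength about 2.0% shorter than in (b), for roughly 2.1% more vias.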
the dimensions of the smallest block. The best reported solution is Its combination with the two-phase algorithm was able to further
evaluated with a finer resolution (smaller grid size) using HotSpot reduce temperature ( 37%) and gradients ( 56%), but compro-
with its default heat sink model. The obtained temperature values mising area with an average 11% increase.
are tightly related to the stack properties (materials and dimen-
sions) and to the power values assigned to the blocks. Since
related works [14–16] do not detail these aspects, direct compar- 7. Conclusion
ison is not possible.
Table 5 presents the temperature for results obtained con- The performance of optimization algorithms always depends
sidering only area and wirelength minimization. They serve as a on its tuning parameters and on the cost function formulation.
reference to quantify the impact of our proposed methodology. In this work, we focus on the 3D–IC floorplanning optimization
In comparison to Table 3, here we have let the algorithm run for a longer time in order to achieve even more compact floorplans and exacerbate the temperature effect. From this table, one can notice that the maximal gradients for ami33 and, more significantly, ami49, are very high. This can be explained by the fact that, for these small benchmarks, it is difficult to achieve a solution over four layers with low whitespace. In addition, the packing methods used in most of the floorplanners systematically compact the design towards one corner (lower left), accumulating the whitespace on the opposite corner (upper right) of each layer.
To underline the impact of each of our propositions, we perform three tests (Table 6): (i) the heuristic applied to area–wirelength minimization (weights δ and γ set to zero), (ii) the proposed two-phase algorithm with the proposed cost functions (weights δ and γ determined by meta-optimization) and (iii) the two-phase algorithm combined with the heuristic. Case (i) shows that the simple use of the heuristic can considerably reduce temperature, even though temperature is not among the explicit objectives of the cost function. In case (ii), the explicit goal of temperature reduction improves heat reduction but forces the solution to occupy a larger area. Finally, in case (iii), the combination of both techniques yields better results, further reducing the temperature and gradients with smaller area than that in case (ii). Additionally, the use of the heuristic in case (iii) delayed the transition to the second phase, reducing the number of thermal simulations and leading to shorter runtime. Experimental results have shown that the use of the aforementioned heuristic is very efficient, capable of reducing temperature by 25% and gradients by 47% with little area (+2%) and wirelength (+5%) increase.
problem. We discuss the algorithm implementation and identify 17 parameters that play a role in its performance. The tedious task of discovering sets of parameters that can drive the floorplanner to good results is automated using a Genetic Algorithm in a meta-optimization loop. The approach is applied to our implementation of a 3D floorplanner, but can be easily adapted to others. We also propose a problem formulation that takes into account the overhead of 3D-vias during the entire floorplan process in an improved way.
Furthermore, two methods to reduce peak temperature and thermal gradients in 3D ICs are discussed. On one hand, we propose to use a smart heuristic that favours high power density close to the heat sink. On the other hand, we use a two-phase simulated annealing process which uses two different cost functions. In the first phase we combine the concurrent objectives of area and temperature minimization in a single term. During the second phase, the cost function formulation is augmented with a temperature term that gives means to distribute blocks to minimize not only the maximal temperature, but also the gradients inside the layers. The results obtained show that combining the smart heuristic with the cost function reduces the average peak temperature and gradient by 37% and 56% respectively, while limiting area increase to 11%, and that of wirelength to 14%. Our approach shows that, with a smarter problem formulation, one can efficiently explore the margins left for temperature reduction during floorplanning. Algorithms that perform thermal via insertion can take advantage of this formulation to further improve the quality of the solution. Moreover, the approach has been formulated in such a way as to render fairly straightforward the inclusion of additional objectives in the floorplanning problem, such as fabrication cost.
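As a rough illustration of the heat-sink heuristic discussed above, the following Python sketch biases the destination layer of an inter-layer (3D) move towards the heat sink when the moved block is power-dense. The function name `pick_target_layer`, the linear weighting and the convention that layer 0 is adjacent to the heat sink are all assumptions made for this example; the floorplanner's actual rule is not reproduced here.

```python
import random


def pick_target_layer(power_density, max_power_density, n_layers, rng=random):
    """Pick a destination layer for an inter-layer move (hypothetical rule).

    Assumption: layer 0 is the layer adjacent to the heat sink.
    """
    # Normalised power density in [0, 1]; 1 is the most power-dense block.
    hunger = power_density / max_power_density
    # Cool blocks (hunger = 0) get uniform weights. For hot blocks the weight
    # of layer k decreases linearly with its distance from the heat sink.
    weights = [
        (1.0 - hunger) + hunger * (n_layers - k) / n_layers
        for k in range(n_layers)
    ]
    return rng.choices(range(n_layers), weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(42)
    hot = [pick_target_layer(5.0, 5.0, 4, rng) for _ in range(1000)]
    cool = [pick_target_layer(0.0, 5.0, 4, rng) for _ in range(1000)]
    print("hot block, draws per layer: ", [hot.count(k) for k in range(4)])
    print("cool block, draws per layer:", [cool.count(k) for k in range(4)])
```

A probabilistic bias of this kind keeps every layer reachable, so the annealer can still trade temperature against area and wirelength rather than being forced into a fixed layer assignment.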
Table 5
Reference values from area–wirelength minimization.
Table 6
Impact of the proposed techniques on temperature and gradient reduction. Results are relative to Table 5.
Circuit | Area–wirelength with heuristic      | Proposed algorithm                  | Proposed algorithm with heuristic
        | Area  Wire  Temp  Grad  Runtime     | Area  Wire  Temp  Grad  Runtime     | Area  Wire  Temp  Grad  Runtime
ami33   | 1.04  1.01  0.72  0.79  1.08        | 1.09  1.08  0.71  0.67  1.67        | 1.07  1.06  0.63  0.61  1.41
ami49   | 1.03  1.03  0.73  0.49  1.07        | 1.20  1.15  0.64  0.47  1.53        | 1.13  1.14  0.62  0.37  1.36
n100    | 1.01  1.06  0.77  0.49  1.09        | 1.12  1.13  0.67  0.48  2.03        | 1.12  1.16  0.65  0.41  1.95
n200    | 1.01  1.09  0.76  0.48  1.09        | 1.15  1.16  0.65  0.42  1.30        | 1.13  1.18  0.63  0.37  1.25
n300    | 1.01  1.08  0.75  0.42  1.09        | 1.13  1.12  0.66  0.50  1.46        | 1.12  1.15  0.63  0.42  1.37
Average | 1.02  1.05  0.75  0.53  1.08        | 1.14  1.13  0.66  0.51  1.60        | 1.11  1.14  0.63  0.44  1.47
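The percentages quoted in the text can be cross-checked against Table 6 with a few lines of Python. The dictionary layout and the helper names (`column_means`, `percent_change`) are illustrative only; the numbers are copied from the table. Because the per-circuit entries are already rounded, a recomputed mean may differ from the Average row in the last digit.

```python
# Ratios relative to Table 5, copied from Table 6.
# Column order within each tuple: area, wire, temp, grad, runtime.
METRICS = ("area", "wire", "temp", "grad", "runtime")

TABLE6 = {
    "heuristic_only": {
        "ami33": (1.04, 1.01, 0.72, 0.79, 1.08),
        "ami49": (1.03, 1.03, 0.73, 0.49, 1.07),
        "n100": (1.01, 1.06, 0.77, 0.49, 1.09),
        "n200": (1.01, 1.09, 0.76, 0.48, 1.09),
        "n300": (1.01, 1.08, 0.75, 0.42, 1.09),
    },
    "two_phase": {
        "ami33": (1.09, 1.08, 0.71, 0.67, 1.67),
        "ami49": (1.20, 1.15, 0.64, 0.47, 1.53),
        "n100": (1.12, 1.13, 0.67, 0.48, 2.03),
        "n200": (1.15, 1.16, 0.65, 0.42, 1.30),
        "n300": (1.13, 1.12, 0.66, 0.50, 1.46),
    },
    "two_phase_with_heuristic": {
        "ami33": (1.07, 1.06, 0.63, 0.61, 1.41),
        "ami49": (1.13, 1.14, 0.62, 0.37, 1.36),
        "n100": (1.12, 1.16, 0.65, 0.41, 1.95),
        "n200": (1.13, 1.18, 0.63, 0.37, 1.25),
        "n300": (1.12, 1.15, 0.63, 0.42, 1.37),
    },
}


def column_means(rows):
    """Arithmetic mean of each metric over all circuits of one test case."""
    n = len(rows)
    return {m: sum(r[i] for r in rows.values()) / n for i, m in enumerate(METRICS)}


def percent_change(ratio):
    """Convert a ratio to Table 5 into a signed percentage change."""
    return (ratio - 1.0) * 100.0


if __name__ == "__main__":
    for case, rows in TABLE6.items():
        means = column_means(rows)
        summary = ", ".join(f"{m} {percent_change(v):+.1f}%" for m, v in means.items())
        print(f"{case}: {summary}")
```

Read this way, a temperature ratio of about 0.75 in case (i) corresponds to the 25% reduction quoted for the heuristic alone, and a ratio of about 0.63 in case (iii) to the 37% reduction quoted for the combined approach.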
References
[1] J.W. Joyner, R. Venkatesan, P. Zarkesh-Ha, J.A. Davis, J.D. Meindl, Impact of three-dimensional architectures on interconnects in gigascale integration, IEEE Trans. Very Large Scale Integr. VLSI Syst. 9 (6) (2001) 922–928.
[2] A. Rahman, R. Reif, System-level performance evaluation of three-dimensional integrated circuits, IEEE Trans. Very Large Scale Integr. VLSI Syst. 8 (6) (2000) 671–678.
[3] W. Davis, E. Oh, A. Sule, P. Franzon, Application exploration for 3-D integrated circuits: TCAM, FIFO, and FFT case studies, IEEE Trans. Very Large Scale Integr. VLSI Syst. 17 (4) (2009) 496–506.
[4] B. Aull, J. Burns, C. Chen, B. Felton, H. Hanson, C. Keast, J. Knecht, A. Loomis, M. Renzi, A. Soares, V. Suntharalingam, K. Warner, D. Wolfson, D. Yost, D. Young, Laser radar imager based on 3D integration of Geiger-mode avalanche photodiodes with two SOI timing circuit layers, in ISSCC, 2006, pp. 1179–1188.
[5] V. Pavlidis, E. Friedman, Interconnect-based design methodologies for three-dimensional integrated circuits, Proc. IEEE 97 (1) (2009) 123–140.
[6] E.J. Marinissen, Testing TSV-based three-dimensional stacked ICs, in DATE, 2010, pp. 1689–1694.
[7] E.-K. Kim, J. Sung, Yield challenges in wafer stacking technology, Microelectron. Reliab. 48 (2008) 112–1105.
[8] W. Huang, M. Stan, S. Gurumurthi, R. Ribando, K. Skadron, Interaction of scaling trends in processor architecture and cooling, in SEMI-THERM, Feb. 2010, pp. 198–204.
[9] M. Sridhar, A. Raj, A. Vincenzi, M. Ruggiero, T. Brunschwiler, D. Atienza Alonso, 3d-ICE: Fast compact transient thermal modelling for 3D-ICs with inter-tier liquid cooling, in Proceedings of the 2010 International Conference on Computer-Aided Design (ICCAD 2010), vol. 1, no. 1. New York: ACM and IEEE Press, 2010, pp. 1–8.
[10] H. Yu, Y. Shi, L. He, T. Karnik, in: W. Islped, M.R. Nebel, A. Stan, J. Raghunathan, Henkel, D. Marculescu (Eds.), Thermal Via Allocation for 3D ICs Considering Temporally and Spatially Variant Thermal Power, ACM, 2006, pp. 156–161.
[11] B. Goplen, S. Sapatnekar, Placement of thermal vias in 3D ICs using various thermal objectives, IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 25 (4) (2006) 692–709.
[12] Y. Ma, D. Chong, C. Wang, A. Sun, Development of ball grid array packages with improved thermal performance, in Proc. EPTC’05, Vol. 2, Dec. 2005, p. 6 pp.
[13] B. Agostini, M. Fabbri, J.E. Park, L. Wojtan, J.R. Thome, et al., State-of-the-art of high heat flux cooling technologies, Heat Transfer Eng. 28 (4) (2007) 258–281.
[14] P. Zhou, Y. Ma, Z. Li, R.P. Dick, L. Shang, H. Zhou, X. Hong, Q. Zhou, 3D-STAF: scalable temperature and leakage aware floorplanning for three-dimensional integrated circuits, in: G.G.E. ICCAD, Gielen (Eds.), IEEE, 2007, pp. 590–597.
[15] J. Cong, J. Wei, Y. Zhang, A thermal-driven floorplanning algorithm for 3D ICs, in ICCAD. IEEE Computer Society/ACM, 2004, pp. 306–313.
[16] L. Xiao, S. Sinha, J. Xu, E. Young, Fixed-outline thermal-aware 3D floorplanning, in ASP-DAC, Jan. 2010, pp. 561–567.
[17] H. Murata, K. Fujiyoshi, S. Nakatake, Y. Kajitani, VLSI module placement based on rectangle-packing by the sequence-pair, IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 15 (12) (1996) 1518–1524.
[18] H.H. Chan, S.N. Adya, I.L. Markov, Are floorplan representations important in digital design, in Proc. ISPD’05. ACM, 2005, pp. 129–136.
[19] H. Cohn, M. Fielding, Simulated annealing: searching for an optimal temperature schedule, SIAM J. Optim. (1999) 779–802.
[20] S. Adya, I. Markov, Fixed-outline floorplanning: enabling hierarchical design, IEEE Trans. Very Large Scale Integr. VLSI Syst. 11 (6) (2003) 1120–1135.
[21] K. Skadron, M. Stan, W. Huang, S. Velusamy, K. Sankaranarayanan, D. Tarjan, Temperature-aware microarchitecture, in Proc. 30th International Symposium on Computer Architecture, June 2003, pp. 2–13, http://lava.cs.virginia.edu/HotSpot/documentation.htm.
[22] MATLAB genetic algorithm toolbox – user’s guide.
[23] International Technology Roadmap for Semiconductors 2009.
[24] P. Wilkerson, M. Furmanczyk, M. Turowski, Compact thermal modeling analysis for 3D integrated circuits, 11th International Conference Mixed Design of Integrated Circuits and Systems, June 2004.
[25] [Online]. Available: http://cadlab.cs.ucla.edu/three_d/3dic.html.