Algorithm - Cells with Two FET Rows - Transistor-Level Layout of Integrated Circuits

5.2 Cells with Two FET Rows

5.2.1 Algorithm

Algorithm 7 gives a high-level view of the layout flow used within BONNCELL to generate placements for cells with two transistor rows.

In the very first step, the FETs are partitioned into the groups F_b (the FETs that are placed near the GND power rail at the bottom border of the cell) and F_t (the FETs constituting the top transistor row near the V_DD power rail). By default, those groups are chosen as F_b := {F ∈ F | t(F) = n} and F_t := {F ∈ F | t(F) = p}. In real-world instances, most of the cells have a comparable number of n- and p-FETs, and n-FETs tend to be connected to GND while p-FETs tend to be connected to V_DD, which makes this a reasonable strategy.
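The default partition can be sketched in a few lines. This is only an illustration: the `Fet` record and its field names are assumptions made for the example and do not reflect BONNCELL's actual data structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fet:
    """Hypothetical FET record; fet_type plays the role of t(F)."""
    name: str
    fet_type: str  # "n" or "p"


def default_partition(fets):
    """Return (F_b, F_t): n-FETs go to the bottom row, p-FETs to the top row."""
    bottom = [f for f in fets if f.fet_type == "n"]
    top = [f for f in fets if f.fet_type == "p"]
    return bottom, top


netlist = [Fet("N1", "n"), Fet("P1", "p"), Fet("N2", "n"), Fet("P2", "p")]
f_b, f_t = default_partition(netlist)
```

Every FET lands in exactly one of the two groups, so the result is a disjoint partition of the netlist as required by line 1 of Algorithm 7.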

Section 5.2.4 describes how this default behavior can be varied in BONNCELL.

Phase 0

The zeroth phase of the layout flow is designed to yield legal 2-dimensional layouts on almost all instances within a short amount of time. Although the result may be of low quality, it provides a fallback solution in the case that the subsequent phases do not find legal layouts within the runtime limit. For cells that are hard to lay out in this sense, BONNCELL thus outputs a legal result rather than nothing at all.

In line 2 the finger lengths are bounded from above, in addition to a possible user-defined or technology-defined restriction, such that the instance has the following property: Given legal 1-dimensional layouts (x_b, C_b) of F_b and (x_t, C_t) of F_t, their union can be extended to a legal 2-dimensional layout.

Consequently, both FET rows can be processed independently of each other without any consideration of the opposing FET row.

The required property is enforced by defining vertical intervals [y^min_b, y^max_b] and [y^min_t, y^max_t] which are reserved for the placements of both rows, that is, y^min_∗ ≤ y(F) ≤ y^max_∗ − l_f(F) + 1 must hold for F ∈ F_∗ and ∗ ∈ {b, t}. The intervals are chosen sufficiently pessimistically so that no design rule can be violated when the FETs are placed accordingly.
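The interval condition can be checked directly. The sketch below assumes that y(F) denotes the lowest fin occupied by F and that a finger of length l_f(F) covers l_f(F) consecutive fins; the function and parameter names are chosen for this example only.

```python
def fits_row_interval(y, finger_length, y_min, y_max):
    """Check y_min <= y(F) <= y_max - l_f(F) + 1 for a single FET.

    Assumption: y is the lowest fin index used by the FET, so the FET
    occupies the fins y, y + 1, ..., y + finger_length - 1.
    """
    return y_min <= y <= y_max - finger_length + 1


# With the reserved interval [0, 3] and fingers of length 3,
# only the positions y = 0 and y = 1 are admissible.
admissible = [y for y in range(-1, 5) if fits_row_interval(y, 3, 0, 3)]
```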

By construction, every optimal canonical layout that is generated for both rows separately (line 3) can therefore be extended to a legal 2-dimensional cell layout. For both rows Algorithm 2 is used to quickly compute an upper bound for the width of an optimal legal layout. Then, as discussed in Section 4.4.4, an optimal canonical legal layout is determined, and W_best is initialized with the previously found upper bound.


Figure 5.2: Layouts of the same netlist at the end of phase 0 (the red region is reserved for wiring), phase 1 (with a maximum finger length of 5), and phase 2 of Algorithm 7.

Algorithm 7: CellPlacement
Data: A netlist F
Output: A 2-dimensional layout Λ_best
 1  Compute partial netlists F = F_b ∪̇ F_t
 2  (Phase 0) Bound finger lengths for independent row placement
 3  Compute optimal canonical legal layouts of F_b and F_t
 4  Extend the layouts to a legal 2-dimensional layout Λ_best
 5  Q_best ← Q(Λ_best)
 6  (Phase 1) Bound finger lengths to l^max_f
 7  Compute optimal canonical legal layouts Λ_b of F_b and Λ_t of F_t
 8  Let (Λ_big, F_big) be the larger row and (Λ_small, F_small) the smaller row
 9  W_LB ← W(Λ_big)
10  while no legal 2-dimensional layout has been found in phase 1 do
11      forall legal 1-dimensional layouts Λ_big of F_big with W(Λ_big) = W_LB do
12          Find a legal 1-dimensional layout Λ_small of F_small with W(Λ_small) ≤ W_LB such that (Λ_b, Λ_t) can be extended to a legal 2-dimensional layout Λ of F and the quality of Λ is optimal
13          if Q(Λ) < Q_best then
14              Λ_best ← Λ, Q_best ← Q(Λ_best)
15      W_LB ← W_LB + 1
16  (Phase 2) Remove restrictions on finger lengths and repeat steps 7–15
17  return Λ_best

Phase 1

Although the rigid constraints on the length of the fingers are relaxed in phase 1, the FET dimensions are again constrained for this part of the flow.

Here the purpose of that restriction is not to be able to handle both FET rows independently, but to find globally optimal, or at least nearly optimal, solutions while at the same time the search space is reduced as much as possible.

The motivation for this approach lies in the observation that in globally optimal 2-dimensional cell layouts very long fingers occur rarely. The value l^max_f, i.e. the maximum length of a finger during this phase of the flow, was decided upon based on the actual distribution of finger lengths in optimal layouts. For cells that have the most common size, the data given in Figure 5.3 lead to the choice of l^max_f = 5. For smaller and larger cell outlines, the value scales proportionally. In comparison, phase 0 has to run with l^max_f = 3 in order to avoid errors near the center of the cell.

After constraining the finger lengths, BONNCELL generates optimal canonical layouts for both FET rows (line 7). Similar to the preceding phase, those layouts are computed independently of each other, so this time these layouts cannot, in general, be extended to a legal 2-dimensional layout. They serve two other purposes: First, the layouts indicate which transistor row is the larger one, which is renamed F_big. Second, the width of the larger row serves as a lower bound for the width of a legal 2-dimensional layout. This lower bound, W_LB, is then used to check if a legal 2-dimensional layout of that width exists.

Figure 5.3: Number of FETs with k fins per finger in 63 globally optimal layouts from the CLC test bed (cf. Section 7.1). The maximum possible finger length was set to 10.

As long as this is not the case, the lower bound is increased by 1 (line 15). The process stops when the first feasible solution has been found, i.e. when W_LB is set to the smallest width for which a legal 2-dimensional layout exists.
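The outer loop of phase 1 (lines 9, 10 and 15 of Algorithm 7) can be sketched as follows. The callback `has_legal_layout` is a hypothetical stand-in for the enumeration in lines 11 to 14, and `max_width` is a safety limit added for this example that does not appear in the algorithm.

```python
def smallest_feasible_width(initial_lower_bound, has_legal_layout, max_width):
    """Increase W_LB by 1 until a legal 2-dimensional layout of that width exists.

    has_legal_layout(w) is a hypothetical oracle standing in for lines 11-14
    of Algorithm 7. max_width is an assumed safety limit so that the sketch
    terminates even if the oracle never succeeds.
    """
    w_lb = initial_lower_bound
    while w_lb <= max_width:
        if has_legal_layout(w_lb):
            return w_lb  # first feasible width, hence the smallest one
        w_lb += 1
    return None


# Toy oracle: pretend that legal layouts exist exactly for widths >= 7.
found = smallest_feasible_width(5, lambda w: w >= 7, max_width=20)
```

Because the widths are tried in increasing order, the first success certifies that no narrower legal layout exists, which is the statement of Lemma 10.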

The loop in line 11 is implemented by the enumeration technique described in Section 4.4.4. Each time the branching method encounters the case F_u = ∅, i.e. all FETs in the large row have been placed without exceeding the prescribed width W_LB, another routine is called that lays out F_small.

Because the program flow leaves the top-level while loop as soon as the first legal layout has been found, it is known in every iteration that no smaller layouts exist.

Lemma 10. When line 11 is executed in Algorithm 7, then no legal 2-dimensional layout Λ with W(Λ) < W_LB exists.

Due to the fact that within this flow the quality of the layout is optimized rather than its width, an extended bounding possibility is gained: If for the already laid out left part of the FET row the gate netlength exceeds the gate netlength of the best known solution, which by the previous lemma cannot be more compact, then the branch and bound tree can be pruned. To achieve this, LowerBound returns ∞ if NL_g(Λ_part) > NL_g(Λ_best), where Λ_part is the partial layout that has been fixed in the current step of the algorithm.
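A minimal sketch of this additional pruning rule is given below. The parameter `width_bound` stands for whatever value the unmodified LowerBound of Section 4.4.4 would return; that interface is an assumption made for the example.

```python
import math


def lower_bound_with_netlength_pruning(width_bound, partial_gate_netlength,
                                       best_gate_netlength):
    """Return infinity if the partial layout already has a larger gate
    netlength than the best known solution, otherwise the usual bound.

    width_bound is a hypothetical placeholder for the result of the
    unmodified LowerBound function.
    """
    if partial_gate_netlength > best_gate_netlength:
        return math.inf  # forces backtracking in the branch and bound tree
    return width_bound


pruned = lower_bound_with_netlength_pruning(12, partial_gate_netlength=30,
                                            best_gate_netlength=24)
kept = lower_bound_with_netlength_pruning(12, partial_gate_netlength=24,
                                          best_gate_netlength=24)
```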

Phase 1: Small Row Pruning

When F_big has been fully laid out, including the FETs' y-coordinates, the algorithm must decide if F_small can also be laid out legally. Before this is done by a fully featured exact layout algorithm, a pruning method is applied that detects in many, but not all, cases when the unused part of the cell does not suffice to place the remaining FETs. The additional amount of runtime spent in this estimation is very small compared to the runtime that would be wasted in failed attempts to lay out small rows exactly.

Figure 5.4: Only the fin/gate intersections in the green region can be covered by gates in the top FET row. During small row pruning, BONNCELL tries to prove that the area does not suffice without calling an exact layout algorithm.

The idea of the small row pruning is to focus on the gates or, more precisely, on all the possible fin/gate intersections (which can be identified with the elements of Γ := {0, ..., W_LB − 1} × {y_min, ..., y_max}). On every x-coordinate in this discrete grid a set of y-coordinates is unavailable for the placement of F_small. These blocked sets contain

• the indices of fins that are used by the layout of F_big,

• fins that must be reserved for the wire access of F_big,

• or fins lying between F_big and the adjacent power strip.

The usable points in Γ form a shape that must suffice for the layout of F_small. Figure 5.4 illustrates the idea. Note that the exact definition of the usable fin/gate intersections depends on a large number of design rules affecting various layers of metal. The focus on the gates makes the swap status of FETs irrelevant, so it is not considered. Every laid out FET can be seen as a rectangle in Γ, and therefore we only need to consider, for a fixed finger length k, the realization with the fewest fingers that results in fingers of length k. For example, if a FET has fingers of length 3 whenever it is configured with 4 to 6 fingers, then only the rectangular representation with width 4 and height 3 needs to be considered.
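The reduction to one rectangle per finger length can be illustrated as follows. The sketch assumes that each finger count yields exactly one finger length and that the rectangle width equals the number of fingers; the concrete table `options` is invented for the example, apart from the entries for 4 to 6 fingers taken from the text.

```python
def minimal_rectangles(finger_length_by_count):
    """Keep, for every finger length, only the realization with the fewest fingers.

    finger_length_by_count maps a number of fingers to the resulting finger
    length (assumed to be unique per count). The result is a sorted list of
    (width, height) pairs with width = number of fingers.
    """
    fewest = {}
    for count, length in finger_length_by_count.items():
        if length not in fewest or count < fewest[length]:
            fewest[length] = count
    return sorted((count, length) for length, count in fewest.items())


# 4, 5 and 6 fingers all lead to fingers of length 3 (example from the text);
# the entries for 2 and 3 fingers are hypothetical.
options = {2: 6, 3: 4, 4: 3, 5: 3, 6: 3}
rectangles = minimal_rectangles(options)
```

For the table above the candidates with 5 and 6 fingers are dropped, since the rectangle of width 4 and height 3 fits wherever they would fit.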

Similar to the 1-dimensional layout algorithm, a branch and bound method is employed that recursively generates an arrangement of the aforementioned simpler FET model from left to right. For a given initial sequence of these FETs, all possible rectangular shapes of all unplaced FETs are placed at the smallest possible x-coordinate, thereby extending the initial sequence by one element. The smallest possible x-coordinate is affected by two factors:

• If there are no common source/drain nets with the previous FET in the row, then diffusion sharing is impossible and a gap of size d⁺_min is introduced.

• Several design rules regarding two opposing gates on the same x-coordinate can be checked, and if a rule is violated the corresponding placement of the FET can be forbidden.

If the layout of a subset of the FETs requires at least d_max tracks more space than another layout of the same subset, then no further branching needs to be performed on this sub-layout because the more compact solution, together with a gap of size d_max, can be substituted for it in every final solution. If no solution is found with this method, then this serves as a proof that no legal 2-dimensional layout of F_small can be found that extends the layout of F_big. As a consequence, this layout can be pruned and no exact algorithm for the second transistor row needs to be called.

It should be noted that the small row pruning itself is an exponential-time algorithm. Its implementation within BONNCELL ensures that no significant amount of runtime is spent in this method by limiting its available runtime.

Moreover, the algorithm does not start if |F_small| is too large or if the number of fin/gate intersections covered by F_small is less than half of the available number of fin/gate intersections in Γ.
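The grid of usable fin/gate intersections and the start condition of the pruning can be sketched as below. The dictionary-based representation of the blocked sets and the threshold `max_fets` are assumptions for this example; the text only states that the pruning is skipped when |F_small| is "too large".

```python
def usable_points(width, y_min, y_max, blocked):
    """Return the usable part of the grid {0..width-1} x {y_min..y_max}.

    blocked maps an x-coordinate to the set of y-coordinates that are
    unavailable there (fins used by F_big, fins reserved for its wire
    access, fins between F_big and the adjacent power strip).
    """
    return {
        (x, y)
        for x in range(width)
        for y in range(y_min, y_max + 1)
        if y not in blocked.get(x, set())
    }


def should_run_pruning(num_small_fets, covered_by_small, available, max_fets=8):
    """Decide whether the small row pruning is started at all.

    max_fets is a hypothetical threshold, not a value from the source.
    """
    if num_small_fets > max_fets:
        return False
    # Skip if F_small covers less than half of the available intersections.
    return 2 * covered_by_small >= available


grid = usable_points(width=4, y_min=0, y_max=3,
                     blocked={0: {0, 1}, 1: {0, 1}, 2: {0}})
```

In this toy instance 11 of the 16 grid points remain usable, so the pruning would be started for a small row that covers at least 6 intersections.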

Phase 1: Small Row Placement

When the small row pruning method could not prove that the small row has no chance to be laid out within the remaining free space, an exact placement algorithm is called in line 12. This algorithm seeks a layout Λ_small of F_small with the following properties:

• W(Λ_small) ≤ W_LB

• (Λ_small, Λ_big) can be extended to a legal 2-dimensional layout Λ of F

• Q(Λ) < Q_best

The implementation of the method is again derived from the method proposed in Section 4.4.4 to enumerate layouts whose width does not exceed W_LB, which ensures the first item on the list. The second item is satisfied by a modification of the LowerBound function. Although many complicated design rules have an influence on the realizability as a 2-dimensional layout, the rules can be checked locally by a call to the legality oracle. The oracle only has to examine the direct neighborhood of a FET F, including the FETs in F_small that were placed immediately to the left of F and the FETs in F_big in the vertical proximity of F. If a non-legalizable situation is detected in LowerBound, it returns ∞, thereby forcing subsequent backtracking.

To aid the detection of such situations efficiently, several data structures are initialized when the layout of F_big is fixed. Among them are arrays that contain, for every x-coordinate of a gate, the information how many fins are available on that coordinate, which net must be connected to the gate from the larger FET row, and which Vt level appears in the opposing row. Another important part of the legalizability detection is a function that, given two gate contacts with the same x-coordinate, returns the minimum number of fins between the gates that must not be covered by one of the FETs, i.e. the minimum vertical distance. This is crucial because if two different nets must be connected to two such gates, then a large amount of free space between the gates needs to be reserved for wires. If, on the other hand, the same net is accessed, the routing can just insert a vertical wire on the PC layer and only a small amount of free space must be left between the transistors. The exact rules affecting this function heavily depend on the technology and are not discussed in detail here.
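The role of the minimum vertical distance can be illustrated with a deliberately simplified sketch. The two gap values and the interpretation of the "available fins" array (fins not blocked by F_big, before the gate-to-gate spacing is subtracted) are assumptions for this example; the real rules are technology dependent and not given in the text.

```python
def min_free_fins(net_a, net_b, same_net_gap=1, different_net_gap=3):
    """Minimum number of uncovered fins between two opposing gates.

    The gap values are hypothetical placeholders for technology rules.
    """
    return same_net_gap if net_a == net_b else different_net_gap


def column_is_legal(available_fins, finger_length, net_small, net_big):
    """Check one x-coordinate: the finger of the small row plus the required
    free space must fit into the fins left over by F_big (assumed model)."""
    return finger_length + min_free_fins(net_small, net_big) <= available_fins


same_net_ok = column_is_legal(4, 3, "A", "A")       # 3 + 1 <= 4
different_net_ok = column_is_legal(4, 3, "A", "B")  # 3 + 3 > 4
```

The example shows why sharing a gate net with the opposing row can make a placement legal that would otherwise have to be rejected.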

The third item in the list above provides even more bounding opportunities than in the layout of F_big. By Lemma 10, the width of the layout is again irrelevant for the target function. But the netlength can be estimated much better because one half of the cell is fixed in this situation. The idea is to maintain lower bounds NL^LB_g and NL^LB of the gate netlength and total netlength of legal layouts that extend the current partial layout within the algorithm. We describe how this is achieved for the gate netlength; the total netlength is handled analogously.

When the layout of F_big is fixed, numbers b⁻_N and b⁺_N are computed for every net that store the leftmost and rightmost coordinate of an already placed contact that accesses N, i.e. b⁻_N := min T_g(N) and b⁺_N := max T_g(N). If a net does not appear in the larger row, the values are initialized with b⁻_N := ∞ and b⁺_N := −∞. During the layout of F_small, the numbers are updated as FETs are placed and unplaced. Then ∑_{N ∈ N} max{0, b⁺_N − b⁻_N} remains a lower bound for the total netlength.

But this bound can be improved: Assume that when LowerBound is called there are still k fingers of FETs in F_u, the unplaced FETs, which need to be connected to N. Then it can be anticipated that after all FETs have been laid out, b⁺_N is at least 2(W(F_l) + k − 1), where the factor 2 is required because the unit of netlength is half-track. Hence, if k_N denotes the number of fingers connected to N in F_u (or equals −∞ if no such fingers exist), then we can set

NL^LB_g := ∑_{N ∈ N} max{0, max{b⁺_N, 2(W(F_l) + k_N − 1)} − b⁻_N}.
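The bound can be evaluated with a direct transcription of the formula. The dictionary-based inputs and all names are assumptions for this sketch: `bounds` maps each net N to (b⁻_N, b⁺_N), `placed_width` is W(F_l), and `unplaced_fingers` maps a net to k_N, where a missing or zero entry plays the role of k_N = −∞.

```python
import math


def gate_netlength_lower_bound(bounds, placed_width, unplaced_fingers):
    """Sum over all nets of max{0, max{b_plus, 2(W(F_l) + k_N - 1)} - b_minus}."""
    total = 0
    for net, (b_minus, b_plus) in bounds.items():
        k = unplaced_fingers.get(net, 0)
        # No unplaced fingers on this net: the anticipated bound is -infinity.
        anticipated = 2 * (placed_width + k - 1) if k > 0 else -math.inf
        total += max(0, max(b_plus, anticipated) - b_minus)
    return total


bounds = {
    "A": (2, 6),                 # contacts between half-tracks 2 and 6
    "B": (4, 4),                 # a single contact so far
    "C": (math.inf, -math.inf),  # net not placed yet
}
nl_lb = gate_netlength_lower_bound(bounds, placed_width=3,
                                   unplaced_fingers={"A": 2, "C": 1})
```

In this toy instance net A contributes max{6, 2(3 + 2 − 1)} − 2 = 6, while nets B and C contribute 0, so the bound evaluates to 6. Without the anticipation term, net A would only contribute 6 − 2 = 4.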

Now LowerBound is modified to return ∞ if NL^LB_g is strictly larger than the gate netlength of the best known solution. If equality holds, then NL^LB, which is maintained analogously, is compared to the total netlength of the best known solution.

Phase 2

Phase 2 is identical to phase 1 with the only difference that the constraints on the finger lengths are lifted and only the technology-defined (or user-defined, cf. Section 5.4) limits are imposed. The consequence is that the algorithm has much more freedom to choose the sizes of transistors, and thus requires much more runtime in many cases. However, as a good or even optimal solution has already been found in the previous phase, this solution can be returned in case of a timeout. Figure 5.3 serves as evidence that in many cases no solutions will be found in phase 2 that have not been found in phase 1. More substantial evidence will be presented in Section 7.2.1.

Finger Enumeration

The enumeration of the finger numbers, which is part of Algorithm 6, can be improved in the context of quality optimization. Assume that for some F ∈ F we have L^F_f(k) = L^F_f(k+2) = {l}. In other words, increasing the number of fingers by 2 does not change their lengths. For a simple distance function that essentially depends on the finger lengths, which we consider in the implementation of BONNCELL, this implies that the distance requirements between F and neighboring FETs do not change.

It is easy to see that no component of the quality measure can improve, and that ∑_{F ∈ F} n_f(F) l_f(F) strictly degrades, when such a FET enlargement is performed. We can thus forbid such operations by setting L^F_f(k+2) := ∅ whenever L^F_f(k+2) = L^F_f(k) holds.
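The rule can be sketched as a filter on the table of admissible finger lengths. The representation of L^F_f as a dictionary from finger counts to sets of lengths is an assumption for this example. Note that the comparison uses the unmodified input table, so a chain such as L(2) = L(4) = L(6) empties both L(4) and L(6).

```python
def prune_finger_options(lengths_by_count):
    """Set L(k+2) to the empty set whenever L(k+2) equals L(k).

    lengths_by_count maps a finger count k to the set L^F_f(k) of admissible
    finger lengths (hypothetical representation). Comparisons are made
    against the original table, not against already pruned entries.
    """
    pruned = {}
    for k, lengths in lengths_by_count.items():
        previous = lengths_by_count.get(k - 2)
        if lengths and previous == lengths:
            pruned[k] = set()  # enlarging the FET by two fingers gains nothing
        else:
            pruned[k] = set(lengths)
    return pruned


table = {2: {6}, 3: {4}, 4: {3}, 5: {3}, 6: {3}}
result = prune_finger_options(table)
```

Here only the entry for 6 fingers is emptied, because it repeats the finger length of the configuration with 4 fingers. The entry for 5 fingers stays, since it is compared with the one for 3 fingers.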

Following the arguments above, we can conclude that the flow generates provably optimal layouts among all layouts with one row of n-FETs and one row of p-FETs.

Lemma 11. When Algorithm 7 finishes, Λ_best is a legal 2-dimensional layout of F such that Λ_best ≤ Λ for every legal 2-dimensional layout Λ of F with F_b := {F ∈ F | t(F) = n} and F_t := {F ∈ F | t(F) = p}.
