# **CMOS Logic Design with Independent-gate FinFETs**

Anish Muttreja, Niket Agarwal and Niraj K. Jha Dept. of Electrical Engineering, Princeton University, Princeton, NJ 08544 {muttreja, niketa, jha}@princeton.edu

## Abstract

Fin-type field-effect transistors (FinFETs) are promising substitutes for bulk CMOS in nano-scale circuits. In this paper, it is observed that in spite of improved device characteristics, high active leakage may remain a problem for FinFET logic circuits. Leakage is found to contribute 31.3% of total power consumption in power-optimized FinFET logic circuits. Various FinFET logic design styles, based on independent control of FinFET gates, are studied. A new low-leakage logic style is presented. Leakage (total) power savings of 64.7% (14.5%) under tight delay constraints and 91.2% (37.2%) under relaxed delay constraints, through the judicious use of FinFET logic styles, are demonstrated.

# 1 Introduction

Steady miniaturization of transistors with each new generation of bulk CMOS technology has yielded continual improvement in the performance of digital circuits. The scaling of bulk CMOS, however, faces significant challenges in the future due to fundamental material and process technology limits [1]. According to the 2005 International Technology Roadmap for Semiconductors (ITRS) [2], primary obstacles to the scaling of bulk CMOS to sub-32nm gate lengths include short-channel effects, sub-threshold leakage, gate-dielectric leakage and device-to-device variations. It is expected that the use of double-gate field-effect transistors (DG-FETs), which provide better control of short-channel effects, lower leakage and better yield in aggressively scaled CMOS processes, will be required to overcome these obstacles to scaling [2,3].

In addition to better scalability, the use of independently-driven DG-FETs (IDDG-FETs) also allows creative construction of circuit modules [4–6]. The back gate of a DG-FET can be used in various connected configurations. Commonly, the back gate is shorted to the front gate to improve drive strength and control of the channel. In another configuration, voltage bias on the back gate can be used to modulate the front-gate's threshold voltage [7].

The goal of this paper is to explore FinFET logic design styles and study their implications for low-power design. FinFETs have been shown to provide much lower sub-threshold leakage currents than bulk CMOS transistors at the same gate length in studies at the device [8] as well as logic [9] levels. It was estimated that active-mode leakage power might account for as much as 40% of the total power consumption in CMOS circuits at the 70nm technology node [10]. In this paper, we argue that while a move to the widespread use of FinFETs will somewhat mitigate this problem, active leakage in high-performance and even low-power FinFET circuits will remain a problem. Our results indicate that, on average, 31% of the total active power consumption in delay-constrained 32nm FinFET circuits can be attributed to leakage power consumption. All FinFETs in these circuits were driven in the connected-gate configuration. The above power estimate was obtained using circuits from the ISCAS'85 benchmark suite which had been sized using a highly-efficient gate sizing algorithm, based on the algorithm presented in [11], to obtain a *power-optimized* configuration under tight delay constraints. Even under more relaxed delay constraints, leakage power consumption was observed to remain around 30% of the total power on average.

We explore methods to efficiently overcome this challenge through a combination of circuit design techniques and logic-level optimization. We consider the use of IDDG-FETs in digital CMOS design, focusing on the use of independent-gate FinFETs such as those presented in [12, 13]. We describe designs made up of various FinFET logic structures where the back gate is used in different connected configurations, viz. connected to the front gate, reverse-biased to control leakage, and tied to a signal input to obtain a single-transistor switch controlled by two signals in an OR configuration. While some of these configurations have been proposed earlier, to our knowledge, this is the first work to present a comprehensive study of all designs. We also present a novel hybrid logic design style that combines the use of differently-connected FinFETs to provide a low-leakage logic structure with well-balanced rise and fall delays and demonstrate its utility in extensive synthesis experiments.

The remainder of this paper is organized as follows. Related work is reviewed in Section 2. In Section 3, FinFET device characteristics are considered (Section 3.1) followed by the design of various FinFET logic structures (Section 3.2). The utility of the various FinFET design styles studied in this paper is evaluated by performing synthesis, followed by power optimization, with an efficient linear programming based standard cell selection<sup>1</sup> method. The evaluation methodology and our cell selection algorithm are described in Section 4. Evaluation results are presented in Section 5 and conclusions in Section 6.

Acknowledgments: This work was supported in part by SRC under contract No. 2007-HJ-1602.

# 2 Related Work

Double-gate devices have been used in a variety of innovative ways in digital and analog circuit designs. It is not possible here to do justice to the complete body of research in circuit design with FinFETs and other DG-FETs. This section, therefore, only attempts to review the most directly related research in digital logic design with FinFETs and other IDDG-FETs. In the context of digital logic design, the ability to independently control the two gates of a DG-FET has been utilized chiefly in two ways: by merging pairs of parallel transistors to reduce circuit area and capacitance, and through the use of a back-gate voltage bias to modulate transistor threshold voltage. A parallel transistor pair consists of two transistors with their source and drain terminals tied together. Merging transistors was shown to reduce parasitic capacitances in static- and dynamic-logic gates in [14] and [15], respectively.

Threshold voltage control through capacitive coupling between the two gates of a transistor is in fact a central advantage of sufficiently thin DG-FET structures. Threshold voltage at each gate varies linearly, over a wide range of operation [16], in response to the variation of the voltage applied at the other gate. In [4, 17–19], various circuits employing back-gate voltage bias to control sub-threshold leakage were presented.

Since a transistor's threshold voltage affects both its power consumption and delay, assignment of DG-FET back-gate bias must be considered during technology mapping or along with a gate-sizing step for a technology-mapped circuit. Simultaneous assignment of gate sizes and back-gate bias voltages for FinFETs was studied in [9], where sized and biased FinFET circuits were compared to bulk CMOS circuits at the same channel length and shown to have lower leakage as well as better area and delay characteristics.

In light of the above related efforts, the contributions of this paper are three-fold. Firstly, we have attempted to consolidate the above design suggestions in the design of a flexible FinFET-based standard cell library. Secondly, we propose a novel hybrid logic design technique that combines back-gate biasing and merged transistors. Finally, we consider low-power logic design with our standard cell library and demonstrate significant power savings under various output delay constraints.

Unlike [9], where leakage power consumption was only considered during the standby mode, we consider leakage power dissipation in the active mode also and optimize total active power. Standby-mode leakage power consumption can be controlled using a variety of system-level techniques, such as power gating and minimum-leakage input vector application. On the other hand, active-mode leakage power typically requires the use of circuit techniques such as multiple threshold voltages, multiple power supply voltages or a combination of the two [11, 20]. Our goal is to explore new avenues available in FinFET circuits to control active-mode leakage through threshold voltage control and/or the construction of area-efficient logic structures.

# 3 Logic Design

In this section, performance and power characteristics of FinFET logic gates using transistors in various connected configurations (modes) are considered. Some guidelines for "back of the envelope" logic design with FinFETs are also presented. Three modes of FinFET operation may be identified, *viz.* the shorted-gate (SG) mode with transistor gates tied together, the low-power (LP) mode where the back-gate is tied to a reverse-bias voltage to reduce leakage power, and the independent-gate (IG) mode where independent digital signals are used to drive the two device gates. Implementations of a two-input NAND gate in each of the above modes are depicted in Figure 1. A hybrid IG/LP mode NAND gate, which employs a combination of LP and IG modes, is also presented. These are used as the vehicle for discussion in this section. Before proceeding with the design of NAND gates, we consider some characteristics of the FinFET device which have a bearing on digital design.

<sup>1</sup>The term *selection* is used to indicate that both the size and design style of a cell are being selected.



## 3.1 Transistor characteristics and some qualitative arguments

SPICE-simulated DC transfer characteristics, *i.e.*, $I_{ds}$ versus $V_{g_f s}$, for a 32nm N-type FinFET are shown in Figure 2. Here, $V_{g_f s}$ denotes the potential difference between the front gate $(g_f)$ and source terminals. The transistor's source terminal was tied to ground for these simulations and the drain was tied to the power supply. Transfer characteristics are presented for various back-gate voltages $(V_{g_b s})$. A predictive technology model (PTM) for 32nm FinFETs, available from [21], was used for this and all other SPICE experiments reported in this paper. PTM has been validated against laboratory measurements with 32nm FinFETs [22]. The power supply was fixed at 1V. Curves corresponding to SG, LP and IG modes of operation are indicated. Similar results, not shown, were also obtained for a P-type FinFET.

The operating temperature was fixed at  $70^{\circ}$ C for all simulations. It was shown in [23] that FinFET structures suffer from considerable self-heating and the operating temperature in FinFET circuits varies directly with switching activity. Also, thermal simulations in [23] are shown to yield a temperature close to  $70^{\circ}$ C if the switching activity is assumed to be 0.1, as in this paper.

The variation in on- and off-state FinFET currents,  $I_{on}$  and  $I_{off}$ , across the three modes of FinFET operation is notable. FinFETs offer the best drive strength in the SG mode.  $I_{on}$  reduces by about 60% in the IG and LP modes. Application of a reverse-bias on the back gate in the LP mode leads to further reduction in  $I_{on}$ , albeit at a smaller rate<sup>2</sup>. However, a FinFET with one gate fed by logic 0, as in the pull-up P-type FinFET of an IG-mode NAND gate, is not a significantly better driver than a FinFET with a reverse-biased back-gate (LP mode).  $I_{off}$ , on the other hand, decreases much more rapidly with increasing reverse-bias. A strong reverse-bias reduces  $I_{off}$  by more than an order of magnitude, compared to the SG mode, which displays the highest  $I_{off}$ .

It is useful to consider the implications of the above device characteristics on the design of an LP-mode FinFET inverter. Figure 3 plots the variation in average delay and leakage power against change in the back-gate bias voltage for a minimum-sized LP-mode inverter, driving a load four times its size and driven by an input with a slope of 5ps. Both pull-up and pull-down were driven by a back-gate bias of equal strength in this experiment. For instance, if the strength of the back-gate bias was 0.2V, a voltage of -0.2V was used for the back-gate bias of the pull-down FinFET and a voltage of 1.20V was used to bias the pull-up FinFET. The figure also depicts delay and leakage for an SG-mode inverter. It can be seen that inverter delay degrades sharply in going from the SG mode to zero reverse-bias LP mode, and more slowly with increasing reverse-bias. The leakage current, however, depends strongly on the back-gate bias. The leakage curve shows an initial sharp decline but flattens out at back-gate bias voltages exceeding 0.26V. Further increase in the bias can only lead to delay/area overheads without much corresponding savings in leakage. With this in mind, for further experiments in this paper, we set the back-gate bias for N-type FinFETs at -0.26V. For P-type FinFETs, the back-gate voltage was adjusted to 1.18V to equalize rise and fall delays.

Figure 3. LP-mode FinFET inverter delay and leakage power variation with $V_{g_b s}$

## 3.2 Design of logic gates

In this section, we consider the design of NAND gates depicted in Figure 1. The gates shown in Figures 1(a), 1(b) and 1(c) are the SG-, LP- and IG-mode NAND gates, respectively, which were alluded to in the beginning of Section 3. The IG-mode NAND gate is designed according to a compact logic style proposed in [14]. The fourth gate design, shown in Figure 1(d), is a novel hybrid design, which uses FinFETs in both IG and LP modes. We will demonstrate that it achieves better-matched delay characteristics and lower leakage power consumption than the IG-mode gate.

<sup>2</sup>The variation of $I_{on}$ with increasing reverse-bias is not visible on the logarithmically-scaled y axis of Figure 2.

Let us first consider transistor sizing for each gate. All gates in this study were designed to have the minimum possible size that the available FinFET models allowed. All FinFETs, in the pull-up and pull-down blocks, respectively, were sized equally. Let the ratio between the widths of a pull-up  $(W_p)$  and a pull-down FinFET  $(W_n)$  be denoted as  $\beta$ . To obtain  $\beta$ , we used the following considerations:

1) Electron mobility exceeds hole mobility by 1.5 to 2 times.

2) The electrical width of a FinFET is quantized according to the number of fins. FinFETs are implemented as parallel fins between the source and drain regions. It is assumed that wider transistors can only be obtained by increasing the number of fins. The height of each fin is assumed to be fixed.

3) As we observed in Section 3.1, using a FinFET in the LP or IG modes reduces its drive strength by almost 60%.

Values for  $\beta$ , and estimates for input capacitance, off-state current consumption under different input vectors, and delay for each gate are given in Table 1. Assuming a ratio of 2 between electron and hole mobility, a matched CMOS NAND gate may be designed with  $\beta = 1$  [24]. The SG-mode NAND gate can be obtained by directly translating the CMOS NAND design to FinFETs, while retaining the same sizing. Table 1 reports delay measurements obtained using HSPICE, under three load conditions: unloaded and with loads of four (FO4) and twenty (FO20) minimum-sized SG-mode FinFET inverters, respectively, for each design mode. An input slope of 5ps was used to drive the gates.

In the LP-mode gate, the drive strength of every FinFET is reduced equally. Thus, we can continue to use $\beta = 1$. As expected, the average delay of the LP-mode gate is almost twice that of the SG-mode gate. On the other hand, the input capacitance of an LP-mode gate is only half that of an SG-mode gate, because only one FinFET gate is driven by the input signal. More significantly, leakage power, averaged over all input combinations, is reduced by over 90% because of threshold voltage control.

The IG-mode gate was designed to have asymmetric rise and fall delays [14]. Only one transistor gate is used for pull-up in the IG-mode NAND gate. To achieve balanced rise and fall delays, the pull-up would need to be scaled up. However, using equally sized pull-up and pull-down, *i.e.*, $\beta = 1$, yields savings in area, input capacitance and diffusion capacitance at the gate output. As a result, under unloaded conditions, the IG-mode NAND gate has an average delay comparable to, or even better than, the SG-mode NAND gate, but consumes less area and power. Unfortunately, the asymmetry in the pull-up and pull-down drive strengths of an IG-mode gate can lead to large disparities in the rise and fall delays under conditions of greater load. If both transitions through a gate are critical, an IG-mode gate may not be suitable.
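The rise/fall asymmetry described above can be read directly off the delay columns of Table 1. The following Python sketch transcribes the fall and rise delays reported there and computes the rise-to-fall ratio of each NAND design under each load condition. The dictionary layout and the function name are illustrative choices, not part of the original study.

```python
# Fall and rise delays (ps) transcribed from Table 1 for the two-input NAND gates.
# Layout (an illustrative choice): mode -> load condition -> (fall, rise).
NAND_DELAYS_PS = {
    "SG":    {"unloaded": (1.40, 1.72), "FO4": (6.29, 7.75),   "FO20": (24.48, 32.12)},
    "LP":    {"unloaded": (2.25, 3.47), "FO4": (12.39, 15.13), "FO20": (52.33, 66.47)},
    "IG":    {"unloaded": (0.47, 2.56), "FO4": (5.85, 14.97),  "FO20": (23.96, 64.41)},
    "IG/LP": {"unloaded": (1.97, 1.66), "FO4": (12.06, 14.07), "FO20": (51.97, 64.47)},
}

LOADS = ("unloaded", "FO4", "FO20")


def rise_fall_ratio(mode, load):
    """Return rise delay divided by fall delay; 1.0 means perfectly balanced."""
    fall, rise = NAND_DELAYS_PS[mode][load]
    return rise / fall


if __name__ == "__main__":
    for mode in NAND_DELAYS_PS:
        ratios = ", ".join(f"{load}: {rise_fall_ratio(mode, load):.2f}" for load in LOADS)
        print(f"{mode:6s} {ratios}")
```

With the Table 1 values, the IG-mode gate has a rise/fall ratio above 2.5 under every load condition, whereas the ratios of the SG-, LP- and IG/LP-mode gates all stay below 1.6.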

As an alternative, we propose a modification of the IG design, the IG/LP design. In the fashion of an IG-mode gate, in the IG/LP mode, parallel transistors, *i.e.*, the pull-up for a NAND and pull-down for a NOR gate, are merged. However, unlike the IG design, delays are balanced by reducing the strength of the complementary series structure. This can be seen by comparing the IG- and IG/LP-mode NAND gate results in Table 1. Strength reduction is achieved by tying the back gates of FinFETs in series to a strong reverse bias (see Figure 1(d)). Essentially, we propose to slow down the faster transition to match the transition made slow by merging transistors, in exchange for significant savings in leakage. At first sight, this might seem to be a large loss in performance. However, often an IG/LP-mode gate has better worst-case rise and fall delays than its IG-mode counterpart. For instance, IG/LP-mode NAND gates actually have a worst-case (rise) delay that is smaller than or comparable to their IG-mode counterparts, under all load conditions, because of reduced competition from the pull-down network during a rising transition at the output. The same observation applies to the falling transition for corresponding NOR gates. This might make IG/LP gates more useful in situations where both rising and falling transitions through a gate are critical. Also, the IG/LP-mode NAND gate shows savings in excess of 56% and 33% in leakage, averaged across input vectors, and switched capacitance, respectively, compared to the IG design, while retaining the same transistor area.
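The leakage and capacitance savings quoted in this section can be reproduced from the $I_{off}$ and $C_{in}$ columns of Table 1. The sketch below is a minimal check in Python. It averages the off-state current over the four input vectors with equal weight, and it relies on leakage power being proportional to off-state current at a fixed supply voltage. The helper names are illustrative.

```python
# Off-state current (nA) for input vectors 00, 01, 10, 11 and input
# capacitance (aF), transcribed from Table 1.
I_OFF_NA = {
    "SG":    (14.1, 230.3, 210.6, 636.4),
    "LP":    (0.6, 10.6, 9.0, 65.8),
    "IG":    (14.1, 230.3, 210.6, 318.2),
    "IG/LP": (0.6, 10.6, 9.0, 318.2),
}
C_IN_AF = {"SG": 340, "LP": 170, "IG": 255, "IG/LP": 170}


def mean_leakage(mode):
    """Average off-state current, weighting all four input vectors equally."""
    values = I_OFF_NA[mode]
    return sum(values) / len(values)


def saving_pct(baseline, improved):
    """Percentage reduction of `improved` relative to `baseline`."""
    return 100.0 * (baseline - improved) / baseline


if __name__ == "__main__":
    # LP versus SG: the text reports a leakage reduction of over 90%.
    print(f"{saving_pct(mean_leakage('SG'), mean_leakage('LP')):.1f}")
    # IG/LP versus IG: the text reports savings in excess of 56% (leakage)
    # and 33% (input capacitance).
    print(f"{saving_pct(mean_leakage('IG'), mean_leakage('IG/LP')):.1f}")
    print(f"{saving_pct(C_IN_AF['IG'], C_IN_AF['IG/LP']):.1f}")
```

Note that the entire remaining leakage gap between the LP- and IG/LP-mode gates comes from the 11 input vector, where the merged pull-up, which is not reverse-biased, is the off transistor.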

Summarizing this section, we have considered four design styles for digital logic structures using FinFETs. In the interest of brevity, this section presented data only for two-input NAND gates. However, the design techniques examined are generally applicable. We have also designed NOR and AND-OR-INVERT (AOI) gates using the same principles and observed similar trade-offs between power and delay. Including varied implementations of each logic gate, as proposed in this section, in a technology library provides a level of flexibility which might be used to obtain useful trade-offs in the power-delay design space, as can be seen ahead.

# 4 Evaluation Methodology

To evaluate the utility of the different FinFET modes, we constructed technology libraries consisting of cells in each mode. Power and area estimates for synthesized circuits from the ISCAS'85 benchmark suite using combinations of the above technology libraries were used to quantify the utility of each library. The design of the above technology libraries is considered next.

| Design mode | $\beta$ | $C_{in}$ (aF) | $I_{off}$ (nA), input 00 | $I_{off}$ (nA), input 01 | $I_{off}$ (nA), input 10 | $I_{off}$ (nA), input 11 | Unloaded delay (ps), fall | Unloaded delay (ps), rise | Unloaded delay (ps), avg | FO4 delay (ps), fall | FO4 delay (ps), rise | FO4 delay (ps), avg | FO20 delay (ps), fall | FO20 delay (ps), rise | FO20 delay (ps), avg |
|--------|---|-----|------|-------|-------|-------|------|------|------|-------|-------|-------|-------|-------|-------|
| SG     | 1 | 340 | 14.1 | 230.3 | 210.6 | 636.4 | 1.40 | 1.72 | 1.56 | 6.29  | 7.75  | 7.02  | 24.48 | 32.12 | 28.30 |
| LP     | 1 | 170 | 0.6  | 10.6  | 9.0   | 65.8  | 2.25 | 3.47 | 2.86 | 12.39 | 15.13 | 13.76 | 52.33 | 66.47 | 59.40 |
| IG     | 1 | 255 | 14.1 | 230.3 | 210.6 | 318.2 | 0.47 | 2.56 | 1.51 | 5.85  | 14.97 | 10.41 | 23.96 | 64.41 | 44.18 |
| IG/LP  | 1 | 170 | 0.6  | 10.6  | 9.0   | 318.2 | 1.97 | 1.66 | 1.82 | 12.06 | 14.07 | 13.07 | 51.97 | 64.47 | 58.22 |

Table 1. Transistor sizing, input capacitance, leakage current, and delay measurements for the NAND gates in Figure 1

Four libraries, one for each mode, viz. SG, IG, LP and IG/LP, were designed. The following cells were included in the SG- and IG-mode libraries: inverters, two-input NAND and NOR cells, and four-input AOI cells. There were four sizes for the NAND, NOR and AOI cells, X1, X2, X4 and X8, and eight sizes for the inverter, X1, X2, ..., X8. The other two libraries consisted of the same cells configured in the LP and IG/LP modes, respectively. As we observed in Section 3.2, an LP- or IG/LP-mode cell has approximately half the drive strength and loading capacitance of the corresponding SG-mode cell of equal width. Therefore, to obtain LP-mode cells with delay and load equivalent to those of their SG-mode counterparts, larger cell sizes were included in this library: NAND, NOR and AOI cells were made available in five sizes, X1, X2, X4, X8 and X16, and inverters in nine sizes, X1, X2, ..., X8 and X16.

The libraries were obtained by simulating the delay, leakage and short-circuit power consumption of each constituent cell in HSPICE. Transistor capacitance was also measured using HSPICE. Interconnect delay and load were also modeled. Fanout and size-dependent wire load models were obtained by scaling the wire characteristics available as part of a 130nm technology library [25], according to the method presented in [26].

Gate size and mode assignment: Circuits were synthesized by mapping them to a minimum-delay configuration using only SG-mode gates with Synopsys Design Compiler. To evaluate the utility of different design styles, a custom linear programming based algorithm was used to assign gate sizes and modes to the mapped circuit by selecting cells from the various aforementioned technology libraries. Cell selection was based on the algorithm presented in [11, 20], where it was shown that linear programming based sizing can achieve 15%-30% better power consumption than Synopsys Design Compiler. We consistently observed similar improvements in power consumption for our custom sizer compared to Design Compiler's built-in gate sizing algorithm. Therefore, only results obtained with our linear programming based sizing implementation are reported here.

Gate-sizing algorithms commonly proceed by conducting a local search amongst candidate cells for each gate to evaluate the gate's power-delay sensitivity. The power-delay sensitivity is defined as the ratio $\frac{\triangle P}{\triangle D}$ between the reduction in power ($\triangle P$) and the degradation in delay ($\triangle D$) if an alternate cell is used. The sensitivity is thus a measure of the efficiency of each gate. Classical gate-sizing algorithms greedily assign available slack to the gate with the best sensitivity, *i.e.*, the gate which maximizes $\frac{\triangle P}{\triangle D}$. On the other hand, in the linear programming based sizing formulation of [11, 20], a linear program is used to divide available slack amongst gates whose cells can change, which can avoid suboptimal decisions made by the greedy approach. In the remainder of this section, we review the cell selection algorithm presented in [11] and describe our enhancements to it.
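As a concrete illustration of the power-delay sensitivity defined above, the following Python sketch scores candidate cells for a single gate and picks the one with the highest $\frac{\triangle P}{\triangle D}$. It is a simplified sketch rather than the implementation used in this work: the `Cell` record, the cell names and all numbers are hypothetical, and the handling of candidates that save power without slowing the gate (treated as infinitely attractive) is an assumption made here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """Hypothetical library cell record; units are arbitrary."""
    name: str
    power: float
    delay: float


def sensitivity(current, candidate):
    """Power reduction per unit of added delay when swapping current -> candidate.

    Returns None if the candidate saves no power. Returns infinity if it saves
    power without adding delay (an assumption of this sketch).
    """
    delta_p = current.power - candidate.power   # reduction in power
    delta_d = candidate.delay - current.delay   # degradation in delay
    if delta_p <= 0:
        return None
    if delta_d <= 0:
        return float("inf")
    return delta_p / delta_d


def best_alternative(current, candidates):
    """Return the candidate with the highest sensitivity, or None if none saves power."""
    scored = [(sensitivity(current, c), c) for c in candidates]
    scored = [(s, c) for s, c in scored if s is not None]
    if not scored:
        return None
    return max(scored, key=lambda pair: pair[0])[1]


if __name__ == "__main__":
    current = Cell("NAND2_SG_X2", power=10.0, delay=5.0)
    candidates = [
        Cell("NAND2_SG_X1", power=7.0, delay=6.5),  # sensitivity 3.0 / 1.5
        Cell("NAND2_LP_X4", power=6.0, delay=5.5),  # sensitivity 4.0 / 0.5
    ]
    print(best_alternative(current, candidates).name)
```

A greedy sizer would repeatedly apply `best_alternative` to the gate with the best score; the linear programming formulation instead uses the pre-selected alternative of every gate as input and decides jointly which swaps to accept.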

An iterative sizing algorithm, which is able to select alternative cells for any number of gates in the circuit during each iteration, is presented in [11]. Each iteration begins with an alternative cell, with the highest power-delay sensitivity, being chosen for each gate. A linear program is then formulated in terms of a cell change variable, $\gamma_v \in [0,1]$, for each gate $v$ in order to optimize total power. A value of $\gamma_v$ close to 1 indicates that the cell of gate $v$ may be changed to the chosen alternate cell. A 0 value indicates no change. At the end of the iteration, pre-selected alternative cells replace each gate $v$ for which a high value of $\gamma_v$ was obtained. Equation (1) gives the objective used to optimize power in the linear programming formulation. Delay constraints at individual gates (illustrated in Figure 4<sup>3</sup>) and at the circuit outputs are given in Equations (2) and (3), respectively.

$$\min\left(\sum_{v\in V}\gamma_v\triangle P_v\right)\tag{1}$$

$$t_{vw}^{r} \geq t_{uv}^{f} + d_{uv}^{r} + \gamma_{v} \left( \triangle t_{uv,v}^{f} + \triangle d_{uv,v}^{r} + \beta_{vw}^{f} \triangle s_{uv,v}^{r} \right) + \sum_{x \in \mathrm{fanout}(v),\, x \neq w} \gamma_{x} \left( \triangle d_{uv,x}^{r} + \beta_{vw}^{f} \triangle s_{uv,x}^{r} \right) \tag{2}$$

$$\max_{v \in outputs} \{t_v^r, t_v^f\} \le T_{max} \tag{3}$$

Figure 4. A circuit to illustrate delay constraints

<sup>3</sup>Reproduced from [11].

In the above equations, $T_{max}$ is the maximum allowed signal arrival time at circuit outputs. For simplicity, all timing arcs are taken to have a negative polarity. $t_{uv}^f$ is the falling arrival time at gate $v$ from gate $u$. $d_{uv}^r$ is the delay from the signal on $uv$ to the output of $v$ rising. $\triangle t^f_{uv,v}$ is the change in $t^f_{uv}$ due to the cell of $v$ changing. $\triangle d_{uv,x}^r$ is the change in $d_{uv}^r$ due to the cell of gate $x$ changing. $\triangle s_{uv,x}^r$ is the corresponding change in the rising slew of gate $v$, $s_{uv}^r$. $\beta_{vw}^f$ is a sensitivity term that encapsulates the impact of $\triangle s_{uv}^r$ on delay $d_{vw}^f$. We added the $\triangle t^f_{uv,v}$ term to the original formulation in [11] and found that it provides somewhat improved accuracy. The term helps to better model the effect of a change in fanout load of gate $u$ on its delay.

The above linear programming formulation is solved iteratively, with new cells being selected for gates that have a high $\gamma$ value. After each change of cells, timing analysis is performed to check if the maximum arrival-time constraint (ATC) has been violated. Such violations are possible because the above timing constraints are based on individual gates changing, and simultaneous changes in a gate and its fanin are not modeled. If the ATC is violated, iterative delay minimization is performed similarly to the algorithm in [11]. The delay minimization procedure simply minimizes the maximum arrival time at circuit outputs subject to the constraints in Equation (2) above. Details may be found in [11]. Introduction of the $\triangle t^f_{uv,v}$ term in the delay equation above was found to significantly reduce the incidence of the ATC being violated.
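To make the bookkeeping in Equation (2) concrete, the following Python sketch evaluates its right-hand side, *i.e.*, the lower bound on the rising arrival time $t_{vw}^{r}$, for one timing arc. It only evaluates the constraint for given $\gamma$ values and does not solve the linear program. The function name, argument names and all numbers are hypothetical.

```python
def rising_arrival_bound(t_uv_f, d_uv_r, gamma_v,
                         dt_uv_v_f, dd_uv_v_r, ds_uv_v_r,
                         beta_vw_f, other_fanouts):
    """Evaluate the right-hand side of Equation (2).

    other_fanouts: iterable of (gamma_x, dd_uv_x_r, ds_uv_x_r) tuples, one for
    each fanout gate x of v other than w.
    """
    # Contribution of changing the cell of gate v itself.
    own_change = gamma_v * (dt_uv_v_f + dd_uv_v_r + beta_vw_f * ds_uv_v_r)
    # Contribution of changing the cells of the other fanout gates of v.
    fanout_change = sum(
        gamma_x * (dd_x + beta_vw_f * ds_x) for gamma_x, dd_x, ds_x in other_fanouts
    )
    return t_uv_f + d_uv_r + own_change + fanout_change


if __name__ == "__main__":
    # Hypothetical arc: arrival 100, delay 20, gate v fully swapped (gamma_v = 1),
    # one fanout left unchanged (gamma_x = 0) and one swapped to a lighter load.
    bound = rising_arrival_bound(
        t_uv_f=100.0, d_uv_r=20.0, gamma_v=1.0,
        dt_uv_v_f=2.0, dd_uv_v_r=5.0, ds_uv_v_r=4.0,
        beta_vw_f=0.5,
        other_fanouts=[(0.0, 3.0, 2.0), (1.0, -1.0, -2.0)],
    )
    print(bound)
```

When every $\gamma$ is zero the bound reduces to $t_{uv}^{f} + d_{uv}^{r}$, the arrival time of the unmodified circuit, which is why the constraint remains linear in the $\gamma$ variables.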

We also made a further modification to the algorithm, one that was suggested, but not implemented, in [11]. In [11], the cell of gate $v$ is only changed to a lower-power alternative if the value of the cell change variable $\gamma_v$ exceeds a threshold set at 0.99. We found that such a high threshold significantly reduced the potential for cells to be changed and, as a result, the total power savings. Instead, a lower value (0.5) was found to work better. The alternate cell for gate $v$ is then chosen by minimizing power amongst candidate cells for which $d'_v \leq d_v + \gamma_v \triangle d_v$, where $d'_v$ is the delay through $v$ after a cell change. Worst-case slew impact on fanout delay due to the cell change is included in $d'_v$. This allows an alternative that requires less slack to be used if the initially chosen alternative cell for gate $v$ does not meet the allotted slack. The optimization is conservative since it does not violate any delay constraints.
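The thresholded selection step described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the implementation used in this work: the `Cell` record and the example cells are hypothetical, each candidate's `delay` field is taken to be $d'_v$ with the worst-case slew impact already folded in, and the current cell is kept as a fallback so that power never increases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """Hypothetical library cell record; units are arbitrary."""
    name: str
    power: float
    delay: float  # d'_v for candidates, d_v for the current cell


def select_cell(current, candidates, gamma_v, delta_d_v, threshold=0.5):
    """Pick the lowest-power cell that fits the slack allotted to gate v.

    gamma_v is the cell change variable from the linear program, and delta_d_v
    is the delay change of the pre-selected alternative for this gate.
    """
    if gamma_v <= threshold:
        return current  # gamma_v does not exceed the threshold: no change
    delay_budget = current.delay + gamma_v * delta_d_v
    feasible = [c for c in candidates if c.delay <= delay_budget]
    feasible.append(current)  # fallback (assumption): keep the current cell
    return min(feasible, key=lambda c: c.power)


if __name__ == "__main__":
    current = Cell("NAND2_SG_X2", power=10.0, delay=5.0)
    candidates = [
        Cell("NAND2_LP_X4", power=5.0, delay=7.0),    # pre-selected, delta_d = 2.0
        Cell("NAND2_IGLP_X2", power=7.0, delay=6.0),
        Cell("NAND2_SG_X1", power=8.0, delay=5.5),
    ]
    # gamma_v = 0.6 allots a delay budget of 5.0 + 0.6 * 2.0 = 6.2, so the
    # pre-selected cell does not fit but a less aggressive alternative does.
    print(select_cell(current, candidates, gamma_v=0.6, delta_d_v=2.0).name)
```

With a threshold of 0.99, the same gate would be left unchanged for any $\gamma_v$ below that value, which illustrates how a high threshold forgoes partial savings.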

# 5 Experimental Results

We synthesized power-optimized circuits from the ISCAS'85 benchmark suite using various combinations of libraries. The combinations we studied comprised cells in the LP and SG modes, IG/LP and SG modes, and IG and SG modes, respectively. Circuits synthesized using the above combinations of libraries were compared against baseline circuits purely comprising SG-mode cells and operated at the same ATC.

The following procedure was used to obtain power-optimized circuits in each case. Minimum-delay configurations were obtained using Design Compiler, as mentioned earlier. Power was optimized using linear programming based cell selection under successively relaxed ATC. Power minimization requires that the circuit be given some slack so that gates may be converted to cells of smaller sizes or from the SG mode to one of the other, slower/lower-power modes of operation. We report detailed power results for each benchmark at 120% ATC in Table 2. In Figures 5, 6 and 7, we report trends for the constitution of benchmark circuits by mode of FinFET operation, total number of fins required to implement all benchmarks, and power consumption for each of the above combinations of libraries.

## 5.1 Results at 120% ATC

Let us first consider the power values reported in Table 2. Power was measured with the value of input switching activities set to 0.1. At 120% ATC, leakage power in power-optimized SG-mode circuits contributes 31.3% of the total power. Synthesizing the circuits using a combination of LP- and SG-mode gates reduces the leakage power by 64.74% and overall power by 14.51%, on average. A modest increase in dynamic power is observed, yet total active power decreases because of the sizable reduction in leakage power. A tight ATC can be met in spite of the significant reduction in power because an SG-mode cell with a given delay can be replaced by a larger LP-mode cell with the same delay but significantly reduced leakage power consumption. Figure 5 plots, for each combination of libraries<sup>4</sup>, the percentage of gates that were operated in a mode different from the SG mode. It can be seen that around 82% of the gates in the combined LP- and SG-mode circuits are operated in the LP mode. An unfortunate side effect of this high conversion rate to LP-mode gates is a large increase in FinFET area. Total FinFET area, plotted in Figure 6, increases on average by 122.6% compared to pure SG-mode power-optimized circuits at the same ATC.

<sup>4</sup>Columns XX in Figures 5, 6 and 7 refer to the combination of mode XX with the SG mode. Column names can be read from the legend in Figure 6.

To explain the above area increase, let us reconsider our power-optimizing cell selection procedure. The procedure starts with minimum-delay, purely SG-mode circuits implemented using Design Compiler. The total number of fins required to implement these circuits is shown in Figure 6. In obtaining power-optimized SG-mode circuits at a 120% ATC, a sizable reduction in the number of fins (56.9%) was observed. With the availability of LP-mode gates, however, the corresponding reduction was much smaller (4.1%). If only SG-mode gates are available, a power-minimizing algorithm must reduce FinFET area since power, both dynamic and leakage, is directly proportional to it. However, LP-mode gates present a different, but very attractive, trade-off to the sizer: power can be reduced by swapping an SG-mode gate with a larger LP-mode gate, a move which preserves delay through the gate, causes a slight increase in dynamic power, but leads to overwhelming savings in leakage power. IG- and IG/LP-mode gates provide two other design options, which are both more economical of area than LP-mode gates, even though the reduction in leakage is much lower. From Table 2 and Figure 6, it can be seen that a mix of IG/LP- and SG-mode gates provides more modest overall power savings (6.91%) and overall area increase (62.2%). Average leakage power saving is also reduced, but still significant at 20.66% (see Table 2).
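The two fin-count reductions quoted above are consistent with the area overhead reported for mixed LP- and SG-mode circuits. The short Python check below makes the arithmetic explicit, assuming that both reductions are measured against the same minimum-delay SG-mode fin count. The function name is illustrative.

```python
def relative_increase_pct(reduction_a_pct, reduction_b_pct):
    """Percentage by which design B exceeds design A in fin count.

    Both arguments are reductions (in percent) from one common baseline.
    """
    remaining_a = 1.0 - reduction_a_pct / 100.0  # fraction of baseline fins left in A
    remaining_b = 1.0 - reduction_b_pct / 100.0  # fraction of baseline fins left in B
    return 100.0 * (remaining_b / remaining_a - 1.0)


if __name__ == "__main__":
    # A: power-optimized SG-only circuits (56.9% fewer fins than minimum-delay).
    # B: power-optimized mixed LP/SG circuits (4.1% fewer fins than minimum-delay).
    print(f"{relative_increase_pct(56.9, 4.1):.1f}")
```

The result is roughly 122.5%, which agrees with the 122.6% area increase reported for this combination up to rounding of the two input percentages.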

Unfortunately, IG-mode gates are not equally useful: their availability in the cell library yields only small savings in power consumption compared to pure SG-mode circuits. While IG-mode gates have smaller capacitance and, therefore, lower power consumption, only around 16.33% of the gates in the final power-optimized circuits, at a 120% ATC, were actually operated in the IG mode (see Figure 5), thus limiting overall power benefits.
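The savings quoted in this subsection can be cross-checked against the Total row of Table 2. The sketch below recomputes the leakage share of SG-mode power and the leakage and total power savings of each mixed library. The dictionary layout and the helper name `savings_pct` are illustrative choices, not part of the original flow. The recomputed values agree with the reported percentages to within about 0.05 percentage points; we assume the residual differences come from rounding in the reported figures.

```python
# "Total" row of Table 2, power in microwatts: (dynamic, leakage, total).
TOTALS = {
    "SG":       (17801.75, 8109.98, 25911.73),
    "SG+LP":    (19280.44, 2859.72, 22140.15),
    "SG+IG/LP": (17688.25, 6433.91, 24122.16),
    "SG+IG":    (17734.83, 7704.42, 25439.25),
}


def savings_pct(baseline, value):
    """Percentage reduction of `value` relative to `baseline`."""
    return 100.0 * (baseline - value) / baseline


_, sg_leakage, sg_total = TOTALS["SG"]

# Share of total power due to leakage in the pure SG-mode circuits.
leakage_share = 100.0 * sg_leakage / sg_total
print(f"SG leakage share: {leakage_share:.1f}%")

# Leakage and total power savings of each mixed library versus pure SG mode.
savings = {}
for lib, (_, leakage, total) in TOTALS.items():
    if lib == "SG":
        continue
    savings[lib] = (savings_pct(sg_leakage, leakage), savings_pct(sg_total, total))
    print(f"{lib:9s} leakage {savings[lib][0]:6.2f}%  total {savings[lib][1]:6.2f}%")
```

Working from the column totals rather than per-circuit percentages weights each benchmark by its power, which appears to be how the Savings row of Table 2 was derived.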

## 5.2 Trends in power savings and FinFET area overhead across ATCs

Next, we consider trends in overall power savings and FinFET area overheads with different FinFET libraries at successively relaxed ATCs. We note that in all cases, pure SG-mode circuits provide the lowest area, but at the highest power consumption (see Figures 6 and 7). At relaxed ATCs, the selection algorithm has more overall slack to allocate to individual gates, leading to higher savings in power, as well as reduced area overheads. Figure 7 shows that overall power savings with mixed LP- and SG-mode circuits are as high as 37.25% at a 200% ATC. Leakage power savings were even higher at 91.23%. Area overhead is also significantly reduced, at 66.9% compared to 122.6% at a 120% ATC.

The other two configurations, *viz.* mixed IG and SG modes and mixed IG/LP and SG modes, respectively, also show improvements in power consumption and area. Consistently, however, a larger fraction of gates were converted to the IG/LP-mode than to the IG-mode from the SG-mode. As a result, power savings with the use of SG- and IG/LP-mode gates (22.57% at 200% ATC) were consistently higher compared to using SG- and IG-mode gates (13.64% at 200% ATC). The IG-mode configuration is, however, more economical in area than the IG/LP-mode configuration.
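The trend across ATCs can be summarized from the reported averages alone. The sketch below tabulates total power savings for each mixed library at the 120% ATC (Savings row of Table 2) and at the 200% ATC (values quoted in this subsection), then computes the gain in percentage points. The dictionary structure and function name are illustrative; no data beyond the reported averages is used.

```python
# Reported average total power savings (%) versus pure SG-mode circuits.
# 120% ATC values come from Table 2; 200% ATC values from Section 5.2.
total_power_savings = {
    "SG+LP":    {120: 14.51, 200: 37.25},
    "SG+IG/LP": {120: 6.91,  200: 22.57},
    "SG+IG":    {120: 1.82,  200: 13.64},
}


def gain_points(by_atc):
    """Percentage-point gain in savings when the ATC is relaxed from 120% to 200%."""
    return by_atc[200] - by_atc[120]


gains = {lib: gain_points(v) for lib, v in total_power_savings.items()}
for lib, gain in gains.items():
    print(f"{lib:9s} +{gain:.2f} percentage points")

# Rank the libraries by savings at each ATC to check that the ordering is stable.
rank_120 = sorted(total_power_savings, key=lambda k: total_power_savings[k][120], reverse=True)
rank_200 = sorted(total_power_savings, key=lambda k: total_power_savings[k][200], reverse=True)
print("Same ordering at both ATCs:", rank_120 == rank_200)
```

All three libraries gain from the added slack, and their relative ordering (LP, then IG/LP, then IG) is the same at both constraints.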

Based on the above results, we suggest that LP-mode gates are attractive candidates for power reduction in circuits with relaxed area constraints or relaxed ATCs. Since the ATCs in our experiments are aggressively defined, these relaxed constraints might actually suffice for a large class of circuits. It is perhaps more advantageous to include IG- or IG/LP-mode cells in technology libraries meant for synthesis of circuits with tight area/delay constraints.



Figure 5. Constitution of circuits by mode



| Design  | SG mode: Dynamic | SG mode: Leakage | SG mode: Total | SG and LP modes: Dynamic | SG and LP modes: Leakage | SG and LP modes: Total | SG and IG/LP modes: Dynamic | SG and IG/LP modes: Leakage | SG and IG/LP modes: Total | SG and IG modes: Dynamic | SG and IG modes: Leakage | SG and IG modes: Total |
|---------|----------|---------|----------|----------|---------|----------|----------|---------|----------|----------|---------|----------|
| c432    | 246.91   | 154.91  | 401.81   | 286.30   | 54.02   | 340.32   | 246.40   | 113.44  | 359.84   | 247.78   | 154.34  | 402.12   |
| c499    | 2411.09  | 461.61  | 2872.70  | 2548.74  | 255.76  | 2804.50  | 2412.39  | 423.38  | 2835.77  | 2418.99  | 469.95  | 2883.94  |
| c880    | 610.28   | 300.20  | 910.48   | 661.061  | 64.55   | 725.61   | 587.44   | 220.43  | 807.87   | 598.00   | 280.30  | 878.36   |
| c1355   | 2758.91  | 511.86  | 3270.77  | 2844.91  | 295.08  | 3139.99  | 2734.16  | 438.68  | 3172.84  | 2780.61  | 519.31  | 3299.92  |
| c1908   | 930.81   | 397.46  | 1328.27  | 1000.61  | 148.20  | 1148.80  | 921.91   | 329.75  | 1251.65  | 925.82   | 374.44  | 1300.26  |
| c2670   | 1474.67  | 521.78  | 1996.45  | 1602.14  | 150.17  | 1752.31  | 1437.89  | 394.48  | 1832.37  | 1476.34  | 502.03  | 1978.37  |
| c3540   | 1014.36  | 752.38  | 1766.74  | 1177.08  | 161.61  | 1338.69  | 1005.79  | 540.22  | 1546.01  | 1007.77  | 674.91  | 1682.68  |
| c5315   | 2462.13  | 1011.43 | 3473.56  | 2636.14  | 241.997 | 2878.13  | 2385.27  | 793.06  | 3178.33  | 2418.31  | 882.14  | 3300.45  |
| c6288   | 2587.90  | 2556.27 | 5144.17  | 3306.00  | 416.62  | 3722.62  | 2705.72  | 1996.80 | 4702.52  | 2585.01  | 2504.52 | 5089.53  |
| c7552   | 3304.69  | 1442.09 | 4746.78  | 3548.50  | 324.78  | 3873.28  | 3251.27  | 1183.68 | 4434.95  | 3276.13  | 1347.50 | 4623.6   |
| Total   | 17801.75 | 8109.98 | 25911.73 | 19280.44 | 2859.72 | 22140.15 | 17688.25 | 6433.91 | 24122.16 | 17734.83 | 7704.42 | 25439.25 |
| Savings | 0        | 0       | 0        | -8.31%   | 64.74%  | 14.51%   | 6.37%    | 20.66%  | 6.91%    | 0.38%    | 4.99%   | 1.82%    |

Table 2. Power consumption ($\mu W$) of ISCAS'85 benchmarks with various FinFET logic libraries

## 6 Conclusions

In conclusion, we have discussed various logic styles for low-power FinFET circuits. We demonstrated that the rich diversity of design styles, made possible by independent control of FinFET gates, can be used effectively to reduce total active power consumption in digital circuits. We presented a systematic study of the area overheads and power savings with different FinFET design styles and provided a new hybrid design style that enables useful trade-offs between circuit area and power consumption.

## References

- [1] D. J. Frank, R. H. Dennard, E. Nowak, P. M. Solomon, Y. Taur, and H.-S. P. Wong. Device scaling limits of Si MOSFETs and their application dependencies. *Proc. IEEE*, 89(3):259–288, Mar. 2001.
- [2] 2005 International Technology Roadmap for Semiconductors. http://www.itrs.net/.
- [3] T.-J. King. FinFETs for nanoscale CMOS digital integrated circuits. In *Proc. Int. Conf. Computer-Aided Design*, pages 207–210, Nov. 2005.
- [4] L. Wei, Z. Chen, and K. Roy. Double gate dynamic threshold voltage (DGDT) SOI MOSFETs for low power high performance designs. In *Proc. IEEE Int. SOI Conf.*, pages 82–83, Oct. 1997.
- [5] P. Beckett. A fine-grained reconfigurable logic array based on double gate transistors. In *Proc. IEEE Int. Field-Programmable Technology Conf.*, pages 260–267, Dec. 2002.
- [6] I. Aller. The double-gate FinFET: Device impact on circuit design. In *Proc. Int. Solid-State Circuits Conf.*, pages 14–15 (and visual supplements, pp. 655–657), Feb. 2003.
- [7] I. Yang, A. Chandrakasan, and D. Antoniadis. Backgated CMOS on SOIAS for dynamic threshold voltage control. *IEEE Trans. Electron Devices*, 44(5):822–831, May 1997.
- [8] E. J. Nowak, I. Aller, T. Ludwig, K. Kim, R. V. Joshi, C.-T. Chuang, K. Bernstein, and R. Puri. Turning silicon on its edge. *IEEE Circuits and Devices Magazine*, 20(1):20–31, Jan.-Feb. 2004.
- [9] B. Swahn and S. Hassoun. Gate sizing: FinFETs vs 32nm bulk MOSFETs. In *Proc. ACM/IEEE Design Automation Conf.*, pages 528–531, July 2006.
- [10] J. Kao, S. Narendra, and A. Chandrakasan. Subthreshold leakage modeling and reduction techniques. In *Proc. Int. Conf. Computer-Aided Design*, pages 141–148, Nov. 2002.
- [11] D. Chinnery and K. Keutzer. Linear programming for sizing,  $V_{dd}$  and  $V_{th}$  assignment. In *Proc. Int. Symp. Low Power Electronics & Design*, pages 149–154, Aug. 2005.
- [12] L. Mathew et al. CMOS vertical multiple independent gate field effect transistor (MIGFET). In *Proc. IEEE Int. SOI Conf.*, pages 187–189, Oct. 2004.
- [13] D. M. Fried, E. J. Nowak, J. Kedzierski, J. S. Duster, and K. T. Kornegay. A fin-type independent double-gate NFET. In *Proc. Device Research Conf.*, pages 45–46, June 2003.

- [14] M.-H. Chiang, K. Kim, C. Tretz, and C.-T. Chuang. Novel high-density low-power logic circuit techniques using DG devices. *IEEE Electronic Device Lett.*, 52(10):2339–2342, Oct. 2005.
- [15] H. Mahmoodi, S. Mukhopadhyay, and K. Roy. High performance and low power domino logic using independent gate control in double-gate SOI MOSFETs. In *Proc. IEEE Int. SOI Conf.*, pages 67–68, Oct. 2004.
- [16] W. Zhang, J. G. Fossum, L. Mathew, and Y. Du. Physical insights regarding design and performance of independent-gate FinFETs. *IEEE Electronic Device Lett.*, 52(10):2189–2206, Oct. 2005.
- [17] P. Beckett. Low-power circuits using dynamic threshold voltage devices. In *Proc. Great Lakes Symp. VLSI*, pages 213–216, Apr. 2005.
- [18] T. Cakici, H. Mahmoodi, S. Mukhopadhyay, and K. Roy. Independent gate skewed logic in double-gate SOI technology. In *Proc. IEEE Int. SOI Conf.*, pages 83–84, Oct. 2005.
- [19] R. V. Joshi, K. Kim, R. Q. Williams, E. J. Nowak, and C.-T. Chuang. A high-performance, low leakage, and stable SRAM row-based back-gate biasing scheme in FinFET technology. In *Proc. Int. Conf. VLSI Design*, pages 665–672, Jan. 2007.
- [20] D. Nguyen, A. Davare, M. Orshansky, D. Chinnery, B. Thompson, and K. Keutzer. Minimization of dynamic and static power through joint assignment of threshold voltages and sizing optimization. In *Proc. Int. Symp. Low Power Electronics & Design*, pages 158–163, Aug. 2003.
- [21] W. Zhao and Y. Cao. New generation of predictive technology model for sub-45nm design exploration. In *Proc. Int. Symp. Quality of Electronic Design*, pages 585–590, May 2006. http://www.eas.asu.edu/~ptm.
- [22] W. Zhao and Y. Cao. Predictive technology model for nano-CMOS design exploration. *ACM J. Emerging Technologies in Computing Systems*, 3(1):1–17, Apr. 2007.
- [23] J. H. Choi, A. Bansal, M. Meterelliyoz, J. Murthy, and K. Roy. Leakage power dependent temperature estimation to predict thermal runaway in FinFET circuits. In *Proc. Int. Conf. Computer-Aided Design*, pages 583–586, Nov. 2006.
- [24] I. Sutherland, R. F. Sproull, and D. Harris. *Logical Effort: Designing Fast CMOS Circuits*. Morgan Kaufmann, San Francisco, CA, 1999.
- [25] G. Petley. VLSI and ASIC technology standard cell library design. http://vlsitechnology.org.
- [26] D. Sylvester and K. Keutzer. Getting to the bottom of deep submicron. In *Proc. Int. Conf. Computer-Aided Design*, pages 203–211, Nov. 1998.