# Ultra-Low Power Phase-Locked Loops for Near-Threshold Voltage Operation

Joung-Wook Moon

The Graduate School

**Yonsei University** 

**Department of Electrical and Electronic Engineering** 

# Ultra-Low Power Phase-Locked Loops for Near-Threshold Voltage Operation

by

## Joung-Wook Moon

A Dissertation

Submitted to the Department of Electrical and Electronic Engineering and the Graduate School of Yonsei University in partial fulfillment of the requirements for the degree of

## **Doctor of Philosophy**

August 2015

This certifies that the dissertation of Joung-Wook Moon is approved.

Thesis Supervisor: Woo-Young Choi

Seong-Ook Jung

**Tae-Wook Kim** 

Jung-Hwan Choi

**Du-Ho Kim** 

**The Graduate School** 

**Yonsei University** 

August 2015

## **Table of Contents**

- Table of Contents
- List of Tables
- List of Figures
- Abstract
- 1. Introduction
  - 1-1. Low Power Design
  - 1-2. Near-Threshold Voltage Design
  - 1-3. Ultra-Low Power, Energy-Efficient Phase-Locked Loop Design
  - 1-4. Outline of Dissertation
- 2. NTV PLL Design Consideration
  - 2-1. PFD Design Consideration
  - 2-2. CP Design Consideration
    - 2-2-1. CP Non-ideal Effects
  - 2-3. VCO Design Consideration
  - 2-4. FD Design Consideration
  - 2-5. Summary
- 3. A 500 MHz Ultra-Low Power PLL for NTV operation
  - 3-1. Introduction
  - 3-2. Proposed NTV Supply PLL Architecture
  - 3-3. NTV PFD Design
  - 3-4. Mismatch and Variation Tolerant CP
    - 3-4-1. Proposed CP Analysis
  - 3-5. NTV VCO Analysis
  - 3-6. Dual-Loop AFC
  - 3-7. TSPC and E-TSPC FD analysis
  - 3-8. Measurement Results
  - 3-9. Summary
- 4. NTV PLL Application Extension
  - 4-1. Introduction
  - 4-2. A 90~350 MHz NTV PLL with ALF CP
    - 4-2-1. Proposed PLL Architecture
    - 4-2-2. Charge Pump with Active Loop-Filter
    - 4-2-3. VCO with AFC
    - 4-2-4. Measurement Results
  - 4-3. Summary
- 5. Conclusion
- Bibliography
- Abstract (In Korean)

## **List of Tables**

- Table 3-1. Reset delay simulation summary
- Table 3-2. Maximum FD operation frequency
- Table 4-1. Power breakdown
- Table 5-1. Performance summary of NTV PLLs

## **List of Figures**

- Figure 1-1. Development of dynamic and leakage power consumption over time
- Figure 1-2. Energy efficiency of NTV operation. As the supply voltage is reduced, (a) the frequency reduces, and (b) the energy efficiency increases
- Figure 1-3. Power consumption and power efficiency of recent ultra-low supply voltage PLLs
- Figure 2-1. (a) Conventional PFD circuit and (b) state diagram of conventional PFD
- Figure 2-2. Timing diagram when (a) CP has no current mismatch, (b) CP has a finite current mismatch, (c) CP has a finite current mismatch and PFD reset delay is varied
- Figure 2-3. Simulated traditional PFD reset delay vs. supply voltage
- Figure 2-4. (a) Block diagram of PFD with CP and (b) timing diagram
- Figure 2-5. Conventional single-ended CP structures: (a) Switch in Drain, (b) Switch in Source, and (c) Switch in Gate
- Figure 2-6. Schematics of current matching CP (a) with gain-boosting circuit, and (b) dual-loop compensation circuit
- Figure 2-7. Schematics of the DTCMOS CP
- Figure 2-8. Impact of voltage scaling on gate delay variation
- Figure 2-9. Mismatching in the $C_P$ (a) Error caused by charge injection $Q_C$ and (b) charge sharing between $C_P$ and capacitors $C_{UP}$ and $C_{DN}$
- Figure 2-10. V-I characteristic curve for low supply voltage PMOS transistor. (a) Conventional gate-controlled PMOS and (b) body-bias controlled PMOS
- Figure 2-11. (a) Schematic of a D-FF FD and (b) its frequency waveform
- Figure 2-12. Operation of TSPC flip-flop (a) hold mode and (b) evaluation mode
- Figure 2-13. Timing diagram of TSPC FD
- Figure 2-14. Schematic of E-TSPC FD
- Figure 2-15. Timing diagram of E-TSPC FD
- Figure 3-1. Impact of voltage scales on (a) power efficiency, and (b) VCO period and power consumption for simple VCO circuit
- Figure 3-2. Proposed PLL architecture
- Figure 3-3. Schematics of six PFD structures. (a) NAND latch1, (b) NAND latch2, (c) NOR D-FF, (d) NAND D-FF, (e) Glitch latch D-FF, (f) Pass-transistor D-FF
- Figure 3-4. Simulated reset delay ($T_{RD}$) vs. supply voltage for each PFD with (a) SS corner, (b) TT corner, (c) FF corner
- Figure 3-5. (a) Simulated $T_{RD}$ at 0.4 V supply, and (b) $T_{RD}$ process variation
- Figure 3-6. Proposed CP structure
- Figure 3-7. (a) CP current mismatch compensation circuit, and (b) Simplified circuit
- Figure 3-8. (a) NMOS transistor, and (b) I-V characteristics with $R_{OUT}$
- Figure 3-9. Increasing the output resistance by adding the gain boosting technique (a) NMOS cascode circuit, (b) gain-boosting circuit and (c) gain circuit implementation
- Figure 3-10. Proposed mismatch and variation compensation CP. (a) gain-boosting implementation for both node, $V_{REF}$ and $V_{CTRL}$, and (b) final gain-boosting circuit implementation with $V_{REF}$ only
- Figure 3-11. Simulated CP currents (a) without compensation, (b) with current mismatch compensation only and (c) with current mismatch and variation compensation at 0.4-V supply
- Figure 3-12. Proposed VCO circuit and its delay cells
- Figure 3-13. Proposed AFC circuit building blocks
- Figure 3-14. Dual-loop AFC flow chart
- Figure 3-15. Simulated result of (a) the digital output of the proposed AFC circuit, and (b) VCO control signal, $V_{CTRL}$
- Figure 3-16. Equivalent RC model for (a) TSPC and (b) E-TSPC
- Figure 3-17. Test board for measurement
- Figure 3-18. Measurement setup
- Figure 3-19. Proposed PLL (a) layout and (b) die micro-photograph
- Figure 3-20. Measured VCO frequencies for sub-bands selected with different AFC codes [0001], [0011], and [0111]
- Figure 3-21. Measured and simulated PLL loop bandwidth for different VCO control voltages
- Figure 3-22. Phase noise correlated with (a) simulation result and (b) measurement result
- Figure 3-23. Measured output spectrum at 500 MHz
- Figure 3-24. Measured output jitter at 500 MHz
- Figure 4-1. Proposed ALF CPPLL architecture
- Figure 4-2. Proposed CP with ALF
- Figure 4-3. Simulated CP output current against CP output voltage for different process corners
- Figure 4-4. Simulation of (a) CP output current, and (b) $V_{CP}$ and $V_{REF}$ voltage against NMOS bias current variation for different process corners
- Figure 4-5. Two stage NMOS mirrored-OTA circuit
- Figure 4-6. Proposed VCO with AFC circuit
- Figure 4-7. Simulated VCO tuning range with different AFC codes
- Figure 4-8. Microphotograph and layout of the PLL
- Figure 4-9. Measured power consumption and power efficiency with different output frequency
- Figure 4-10. Measured output spectra at 350 MHz: with the fixed CP reference voltage (0.25 V) and with automatic compensation
- Figure 4-11. Phase noise of the PLL at 350 MHz
- Figure 4-12. Measured timing jitter of the PLL at 350 MHz
- Figure 5-1. Power consumption and power efficiency of recent state-of-the-art NTV PLLs and proposed PLLs

Abstract

# Ultra-Low Power Phase-Locked Loops for Near-Threshold Voltage Operation

### Joung-Wook Moon

Dept. of Electrical and Electronic Engineering The Graduate School Yonsei University

Power consumption has become the primary design consideration for most integrated circuits (ICs), and the low-power phase-locked loop (PLL), an essential block for IC clock generation, is of great research interest. Aggressive supply-voltage scaling has been the most effective method of reducing power consumption. However, a reduction in supply voltage does not by itself guarantee power efficiency or energy efficiency. In this dissertation, ultra-low-power, highly energy-efficient PLLs operating in the near-threshold voltage (NTV) region are demonstrated in standard CMOS technology. To overcome the low-voltage headroom problem, novel charge pump (CP) structures and PLL architectures are presented.

A fully integrated 500-MHz ultra-low-power PLL that operates from a 0.4-V supply is demonstrated using 65-nm standard CMOS technology. We present the results of a CP structure that provides current matching and suppresses current variation across the control-voltage range. In addition, this PLL consumes 127.8 $\mu$W, which corresponds to a power efficiency of 0.256 mW/GHz.

This work is extended by demonstrating a different PLL architecture. An NTV PLL that includes an active-loop-filter CP, which significantly reduces the reference spur and the amplifier's burden, is demonstrated in 65-nm standard CMOS technology. The prototype achieves 90~350-MHz operation from a 0.4-V supply with a power efficiency of 0.31 mW/GHz at 350 MHz.

*Keywords*: Automatic frequency calibration (AFC), charge pump (CP), current mismatch, current variation, near-threshold voltage (NTV), phase-locked loop (PLL), ultra-low power, ultra-low voltage (ULV), voltage-controlled oscillator (VCO).

## **1. Introduction**

### 1-1. Low Power Design

Power consumption is the key limitation in many electronic systems today, ranging from mobile communication systems to electronic data processing systems. In particular, as process technology has moved into the sub-micrometer regime, the reduction in die size and the increased number of transistors have led to a dramatic increase in the power consumption of integrated circuits (ICs).

Depending on the system and its application, power dissipation in complementary metal-oxide-semiconductor (CMOS) circuits comes from two different sources: dynamic power and static power [1].

$$P_{avg} = P_{dynamic} + P_{static} = (P_{short} + P_{switch}) + P_{static}$$
$$= I_{SC} \cdot V_{dd} + \alpha \cdot C_L \cdot V_{dd}^2 \cdot f + I_{leak} \cdot V_{dd}$$
(1-1)

$P_{short}$ is the power consumed during the gate-voltage transition time. It is related to the direct-path short-circuit current, $I_{SC}$, that flows when both the PMOS and NMOS transistors are active simultaneously, conducting current from supply to ground.

The second term, $P_{switch}$, refers to the dynamic switching power, where $\alpha$ is the average switching activity factor, $C_L$ is the load capacitance, and $f$ is the clock frequency. The imperfect cut-off of transistors leads to leakage current, $I_{leak}$, and static power dissipation, $P_{static}$, even without any switching activity [2].
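As a rough numerical illustration of Eq. (1-1), the following Python sketch evaluates the three power terms at two supply voltages. All component values are assumed purely for illustration and are not taken from this work. Holding $I_{SC}$ and $I_{leak}$ constant with supply voltage is a simplification.

```python
def average_power(v_dd, f_clk, c_load, alpha, i_sc, i_leak):
    """Return (P_short, P_switch, P_static, P_avg) in watts, following Eq. (1-1)."""
    p_short = i_sc * v_dd                         # short-circuit term
    p_switch = alpha * c_load * v_dd ** 2 * f_clk  # dynamic switching term
    p_static = i_leak * v_dd                      # leakage term
    return p_short, p_switch, p_static, p_short + p_switch + p_static


# Hypothetical operating point (assumed values, for illustration only).
params = dict(f_clk=500e6, c_load=1e-12, alpha=0.1, i_sc=1e-6, i_leak=1e-7)

for v_dd in (1.0, 0.4):
    p_short, p_switch, p_static, p_avg = average_power(v_dd, **params)
    print(f"V_dd = {v_dd:.1f} V: P_switch = {p_switch * 1e6:.2f} uW, "
          f"P_avg = {p_avg * 1e6:.2f} uW")
```

At a fixed clock frequency, the switching term in this sketch scales with the square of the supply voltage, so moving from 1.0 V to 0.4 V reduces it to 16 % of its original value.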

The charging and discharging of capacitances causes dynamic power consumption whenever a circuit switches, and dynamic power reduction is a concern for most IC products. For battery-powered products, reduced power consumption directly extends the operating time. Even for wire-powered (battery-less) products, reduced power consumption brings several advantages, such as reduced packaging costs or higher performance because of lower temperatures.

The reduction of leakage power consumption is another primary concern for mobile products that spend most of their operating times in standby mode, such as cell phones.

For many process generations, however, leakage has increased roughly by a factor of 10 for every two process nodes [3]. Due to this dramatic increase, leakage current is becoming a significant contributor to overall IC power consumption even in normal operating mode, as can be seen in Figure 1-1 [4-6].

Leakage has been estimated to increase from 0.01% of overall power consumption in a 1.0-µm technology to 10% in a 0.1-µm technology [3]. Despite this difficulty, the dynamic switching power in Eq. (1-1) is proportional to the square of the supply voltage, so lowering the power supply voltage of ICs is the most direct and effective method of reducing power consumption. Therefore, new design methodologies and circuit techniques are essential to control and limit power consumption.



Figure 1-1. Development of dynamic and leakage power consumption over time [6]

### 1-2. Near-Threshold Design

As process technology continues to shrink, the number of on-chip transistors doubles with every generation. The International Technology Roadmap for Semiconductors (ITRS) predicts that the supply voltage will decline below 0.5 V within a decade and continue decreasing to 0.43 V by the year 2026 [7].

However, supply-voltage scaling does not reduce the energy per operation for all circuits. Therefore, the next big challenge is not just low power consumption, but also the energy efficiency of ICs.

Several studies show that near-threshold voltage (NTV) operation, where supply voltage is reduced close to the threshold voltage region, provides higher energy efficiency [8-10].

One NTV study shows that the operating frequency, and hence performance, decreases almost linearly as the supply voltage is lowered. At the same time, active energy per operation decreases quadratically and leakage power decreases exponentially, as shown in Figure 1-2 [8]. In this figure, as the supply voltage is reduced, (a) the frequency decreases and (b) the energy efficiency increases, as expected. However, the energy efficiency peaks near the threshold voltage of the transistor and then starts decreasing in the sub-threshold region.




Figure 1-2. Energy efficiency of NTV operation. As the supply voltage is reduced, (a) the frequency reduces, and (b) the energy efficiency increases [8].

This unexpected reduction in the sub-threshold region can be explained as follows. In the sub-threshold region, leakage power dominates. Although leakage power decreases with voltage, the frequency decreases faster than the leakage power does, so the energy efficiency falls [8]. Therefore, it is desirable to operate near the threshold voltage of the transistor in order to maximize the energy efficiency, which is 9.6 times higher than at the nominal supply voltage.

Other NTV studies have reported measurements confirming these results in 45-nm and 32-nm technologies [11, 12]. These clearly show that NTV operation improves the energy efficiency of ICs across technology generations.

### 1-3. Ultra-Low-Power, Energy-Efficient Phase-Locked Loop (PLL) Design

The integration of analog and digital circuits into single mixed-signal devices, such as system-on-chips (SOCs) or network-on-chips (NOCs), has spread widely with CMOS technology, as mentioned earlier. Advances in process technology will continue to provide an abundance of transistors for integration, limited only by energy consumption. In mixed-signal IC design in particular, it is preferable for the analog and digital blocks to operate from the same supply voltage in order to avoid the need for additional high-voltage supplies or DC-DC voltage converters. Therefore, it is natural to design all of the IC's building blocks with an NTV supply in order to obtain higher energy efficiency.

The phase-locked loop (PLL) is one of the essential, power-intensive building blocks for SOC clock generation, and reducing the supply voltage of the PLL is of significant research interest. Figure 1-3 shows the power consumption and power efficiency of recent ultra-low supply voltage PLLs.



Figure 1-3. Power consumption and power efficiency of recent ultra-low supply voltage PLLs.

### 1-4. Outline of Dissertation

The purpose of this dissertation is to realize ultra-low-power, power-efficient PLLs in standard CMOS technology. Our PLLs are demonstrated with a 0.4-V NTV supply, using newly proposed circuits and structures to maximize power efficiency. The remainder of this dissertation is organized as follows.

In Chapter 2, we review the fundamentals of a PLL and its design considerations for NTV operation, together with the literature on each of the PLL building blocks. The impact of process, voltage, and temperature variations on phase-frequency detectors (PFDs) and the non-ideal effects in charge-pump (CP) designs are studied. Several voltage-controlled oscillators (VCOs) and frequency dividers (FDs) operating at NTV are also summarized.

In Chapter 3, a detailed implementation of a fully integrated 500-MHz ultra-low-power PLL that operates from an NTV supply is presented using 65-nm standard CMOS technology. This PLL consumes 127.8 $\mu$W at a 0.4-V supply voltage, which corresponds to the best power efficiency of 0.256 mW/GHz. In addition, a new ultra-low-power CP structure that overcomes non-idealities is proposed and analyzed.

This work is extended in Chapter 4. A 0.4-V PLL with an active-loop-filter architecture, which significantly reduces the reference spur and the OP-Amp's slew-rate burden, is demonstrated. This PLL consumes 109 $\mu$W for a 350-MHz output, corresponding to a power efficiency of 0.31 mW/GHz.

Finally, Chapter 5 concludes the dissertation with a performance summary of the NTV PLL and comparisons against state-of-the-art PLLs.
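The power-efficiency figures quoted for the two PLLs follow directly from dividing power consumption by output frequency. The short Python check below reproduces them from the stated power and frequency values. The helper name is ours and is introduced only for illustration.

```python
def power_efficiency_mw_per_ghz(power_w, freq_hz):
    """Power efficiency in mW/GHz: consumed power divided by output frequency."""
    return (power_w * 1e3) / (freq_hz / 1e9)


# Power and frequency values as stated for the Chapter 3 and Chapter 4 PLLs.
pll_500mhz = power_efficiency_mw_per_ghz(127.8e-6, 500e6)
pll_350mhz = power_efficiency_mw_per_ghz(109e-6, 350e6)

print(f"500-MHz PLL: {pll_500mhz:.3f} mW/GHz")
print(f"350-MHz PLL: {pll_350mhz:.3f} mW/GHz")
```

A lower value indicates that less power is spent per unit of output frequency.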

## 2. NTV PLL Design Consideration

PLLs are indispensable analog circuits for most systems. However, designing a high-performance PLL with an NTV supply poses great challenges. With an NTV supply, many analog problems (e.g., sensitivity to supply noise, adverse effects of process, voltage, and temperature (PVT) variations, and operating-frequency restrictions) become more critical than with typical supply voltages.

This chapter begins by reviewing the fundamentals and design considerations of the PLL building blocks: a phase-frequency detector (PFD), a charge pump (CP), a voltage-controlled oscillator (VCO), and a frequency divider (FD). After that, each section discusses the details of the design challenges for achieving ultra-low-voltage operation.

### 2-1. PFD Design Consideration

A PFD is an important building block of PLLs that converts the phase or frequency difference between the PLL reference clock ( $f_{REF}$ ) and the frequency-divided VCO clock ( $f_{DIV}$ ) to an UP or DN signal, as shown in Figure 2-1 (a). An active UP output tells the PLL to raise the frequency of the VCO, since the VCO is lagging behind the input signal. An active DN output tells the PLL the opposite. Therefore, UP or DN active signal outputs give the phase error direction. The magnitude of the phase error is indicated by the width of the UP or DN pulse, whichever applies.



Figure 2-1. (a) Conventional PFD circuit and (b) state diagram of conventional PFD.

Figure 2-1 (b) shows a state diagram summarizing the conventional tri-state PFD operation. If the PFD is in the initial state, "Init. state," where UP = DN = 0, a rising edge on $f_{REF}$ takes it to the "UP state," where UP = 1 and DN = 0. The circuit remains in this state until a rising edge occurs on $f_{DIV}$, upon which the PFD returns to "Init. state." The switching sequence between "Init. state" and "Down state" is similar [13].

However, when the phase difference ($\theta_{diff}$) between $f_{REF}$ and $f_{DIV}$ is too small, the UP or DN pulse is so narrow that it cannot be transferred to the next circuit, which is usually a CP. This phenomenon introduces a range in which the CP cannot respond to $\theta_{diff}$, called a "dead zone," which causes static PLL output jitter. In order to eliminate the dead zone, the PFD must have a reset state, "Rst. state," where UP = DN = 1. If the PFD enters this state, it returns to "Init. state" after a time delay $T_{RD}$. From the point of view of the CP, the difference between the charging current and the discharging current is ideally zero when UP = DN = 1, just as when UP = DN = 0. Therefore, the CP average output current is proportional to the pulse-width difference between UP and DN, which corresponds to $\theta_{diff}$ even when $\theta_{diff}$ is very small, and the dead zone can be eliminated.
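The state diagram can be mimicked with a small behavioral model. The Python sketch below is a simplified illustration, not the circuit of Figure 2-1 (a): it assumes ideal edge-triggered flip-flops, represents clocks as lists of rising-edge times, and ignores any edge that arrives while its flip-flop is already set. The function name and data representation are hypothetical.

```python
def tri_state_pfd(ref_edges, div_edges, t_rd):
    """Behavioral tri-state PFD with a reset state of duration t_rd.

    ref_edges, div_edges: rising-edge times of f_REF and f_DIV.
    Returns (up_pulses, dn_pulses) as lists of (start, end) times.
    """
    events = sorted([(t, "ref") for t in ref_edges] +
                    [(t, "div") for t in div_edges])
    up_start = dn_start = reset_at = None
    up_pulses, dn_pulses = [], []

    def finish_reset():
        # Leave the UP = DN = 1 state and return to the initial state.
        nonlocal up_start, dn_start, reset_at
        up_pulses.append((up_start, reset_at))
        dn_pulses.append((dn_start, reset_at))
        up_start = dn_start = reset_at = None

    for t, source in events:
        if reset_at is not None and t >= reset_at:
            finish_reset()
        if source == "ref" and up_start is None:
            up_start = t      # rising f_REF edge sets UP
        elif source == "div" and dn_start is None:
            dn_start = t      # rising f_DIV edge sets DN
        if reset_at is None and up_start is not None and dn_start is not None:
            reset_at = t + t_rd   # both outputs high: reset after t_rd
    if reset_at is not None:
        finish_reset()
    return up_pulses, dn_pulses


# f_REF leads f_DIV by 1 time unit; the reset delay is 0.5 time units.
up, dn = tri_state_pfd(ref_edges=[0.0, 10.0], div_edges=[1.0, 11.0], t_rd=0.5)
for (u0, u1), (d0, d1) in zip(up, dn):
    print(f"UP width = {u1 - u0:.2f}, DN width = {d1 - d0:.2f}")
```

In this model the UP pulse is wider than the DN pulse by exactly the edge spacing, and both pulses are at least `t_rd` wide even when the two edges coincide, which is the property that removes the dead zone.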

Under the NTV condition, the PFD reset delay  $(T_{RD})$  is expanded significantly, because the logic gate delay time is exponentially increased by lowering the supply voltage. The expanded  $T_{RD}$  can easily eliminate the dead-zone problem, but a large  $T_{RD}$  can also introduce a serious PLL static output jitter.

If CP is assumed ideal,  $T_{RD}$  does not become a problem. For example, Figure 2-2 (a) shows the timing diagram when CP is ideal and  $f_{REF}$  and  $f_{DIV}$  have the same phase. Because  $\theta_{diff}$  is zero, UP and DN rise high and fall low simultaneously. A charging current ( $I_{UP}$ ) and discharging current ( $I_{DN}$ ) rise and fall at the same time, and no charge is accumulated at the CP output voltage ( $V_{CTRL}$ ) node. As a result, PLL is locked to maintain this condition.



Figure 2-2. Timing diagram when (a) CP has no current mismatch, (b) CP has a finite current mismatch, (c) CP has a finite current mismatch and PFD reset delay is varied.

Figure 2-2 (b) shows a case in which the CP is assumed to have a finite and fixed current mismatch. In this example, $I_{UP}$ is assumed to be larger than $I_{DN}$. Due to this mismatch, the CP discharging time must be longer than the CP charging time in order to make the CP average output current zero, which means that the area of the gray-patterned rectangle equals the area of the diagonal-line-patterned rectangle in Figure 2-2 (b). Therefore, the DN pulse width must be larger than the UP pulse width, and the PLL locks with a finite phase offset ($\theta_{offset}$), so that the PLL output clock phase fluctuates slightly and a reference spur appears.

The more serious problem with the NTV PLL is that $T_{RD}$ is sensitive to PVT variations. When the PLL is locked and the supply voltage is suddenly decreased by unpredictable noise, $T_{RD}$ expands and charge accumulates at the $V_{CTRL}$ node, as shown in Figure 2-2 (c). In this case, the PLL tries to lock again by extending the DN pulse width in order to make the CP average output current zero. As a result, $\theta_{offset}$ changes, introducing a large PLL output jitter. The jitter from this $\theta_{offset}$ variation cannot be rejected by the PLL dynamics, because the PLL locking point itself varies with the supply noise.

If the supply voltage is high, the phenomenon mentioned above is not a critical problem, because $T_{RD}$ is fundamentally very short and does not vary significantly with the variation factors.
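The dependence of the lock offset on $T_{RD}$ can be made concrete with a simple charge-balance argument. The Python sketch below assumes rectangular current pulses, an UP pulse of width $T_{RD}$, and a DN pulse of width $T_{RD}$ plus an extra time, and solves for the extra time that gives zero net charge per cycle. This model and all numerical values are our assumptions for illustration.

```python
def lock_offset(t_rd, i_up, i_dn):
    """Extra DN pulse width (s) needed for zero net charge when I_UP > I_DN.

    Charge balance per reference cycle (simplified rectangular pulses):
        i_up * t_rd == i_dn * (t_rd + offset)
    """
    return t_rd * (i_up - i_dn) / i_dn


# Assumed 5 % current mismatch and two assumed reset delays.
i_up, i_dn = 1.05e-6, 1.00e-6
offset_short = lock_offset(2.0e-9, i_up, i_dn)
offset_long = lock_offset(3.0e-9, i_up, i_dn)

print(f"T_RD = 2.0 ns -> offset = {offset_short * 1e12:.1f} ps")
print(f"T_RD = 3.0 ns -> offset = {offset_long * 1e12:.1f} ps")
print(f"offset shift  = {(offset_long - offset_short) * 1e12:.1f} ps")
```

Under these assumptions the offset is proportional to $T_{RD}$, so any supply-induced change in $T_{RD}$ moves the lock point by the same fraction.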

Figure 2-3 shows the simulated $T_{RD}$ versus supply voltage for the traditional PFD circuit [14] in a 65-nm CMOS logic process at the typical-typical (TT) process corner. When the supply voltage is 0.4 V and fluctuates from 0.38 V to 0.42 V (±5 % of 0.4 V), $T_{RD}$ varies from 3.94 ns to 3.18 ns. When the supply voltage fluctuates from 0.95 V to 1.05 V (±5 % of 1.0 V) for the same PFD circuit, $T_{RD}$ varies only between 208 ps and 239 ps. Furthermore, process and temperature variations add further spread to this result, and the problem becomes significant.



Figure 2-3. Simulated traditional PFD reset delay vs. supply voltage.
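To put the simulated numbers side by side, the following Python snippet computes the absolute and relative spread of $T_{RD}$ for a ±5 % supply fluctuation at 0.4 V and at 1.0 V, using only the four delay values quoted in the text. Taking the midpoint as the reference for the relative spread is our choice.

```python
def spread(t_a, t_b):
    """Absolute spread and spread relative to the midpoint of two delays."""
    low, high = sorted((t_a, t_b))
    return high - low, (high - low) / ((high + low) / 2)


ntv_abs, ntv_rel = spread(3.94e-9, 3.18e-9)   # 0.4 V +/- 5 %
nom_abs, nom_rel = spread(208e-12, 239e-12)   # 1.0 V +/- 5 %

print(f"0.4 V: {ntv_abs * 1e12:.0f} ps spread ({ntv_rel * 100:.1f} % of midpoint)")
print(f"1.0 V: {nom_abs * 1e12:.0f} ps spread ({nom_rel * 100:.1f} % of midpoint)")
print(f"absolute spread ratio: {ntv_abs / nom_abs:.1f}x")
```

The absolute spread at 0.4 V is more than twenty times larger than at 1.0 V, and it is the absolute time shift that translates into lock-point variation.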

### 2-2. CP Design Consideration

The CP plays an important role in a PLL by making use of the PFD outputs. It receives phase-error information from the PFD and converts it into a correction current, which is then low-pass filtered by a loop filter.

A typical CP consists of two current switches, controlled by the UP and DN terminals of a PFD, as shown in Figure 2-4 (a). The UP switch delivers a pump current $I_{up}$ into a capacitor when the UP terminal of the PFD is active, and the DN switch extracts a pump current $I_{dn}$ from the capacitor when the DN terminal of the PFD is active. Thus, when $f_{REF}$ leads $f_{DIV}$, UP continues to produce pulses and $V_{CTRL}$ rises steadily, as illustrated in Figure 2-4 (b). Ideally, a current switch is an open circuit while it is OFF, and $I_{up}$ and $I_{dn}$ are equal.



Figure 2-4. (a) Block diagram of PFD with CP, and (b) timing diagram.
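The steady rise of $V_{CTRL}$ can be sketched numerically for an ideal CP driving a single capacitor. The Python example below assumes ideal switches, rectangular current pulses, and a purely capacitive load. The current, pulse widths, capacitance, and starting voltage are assumed values chosen only for illustration.

```python
def control_voltage_step(i_up, i_dn, t_up, t_dn, c_load):
    """Change in V_CTRL (V) over one PFD cycle for an ideal CP on a capacitor."""
    net_charge = i_up * t_up - i_dn * t_dn   # charge delivered minus extracted
    return net_charge / c_load


# Assumed values: matched 10-uA currents, 10-pF capacitor, UP pulse 1 ns wider.
i_pump, c_load = 10e-6, 10e-12
v_ctrl = 0.2
for cycle in range(1, 4):
    v_ctrl += control_voltage_step(i_pump, i_pump, t_up=2e-9, t_dn=1e-9,
                                   c_load=c_load)
    print(f"cycle {cycle}: V_CTRL = {v_ctrl * 1e3:.1f} mV")
```

With matched currents, equal UP and DN pulse widths leave $V_{CTRL}$ unchanged, while a wider UP pulse raises it by the same step every cycle.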

In practical designs, three types of conventional single-ended CP topologies are used, as shown in Figure 2-5, since they need no additional circuitry and offer low power consumption.



Figure 2-5. Conventional single-ended CP structures: (a) Switch in Drain, (b) Switch in Source, and (c) Switch in Gate.

The first CP, in Figure 2-5 (a), has switches at the drains of the current-mirror transistors. On the NMOS side, when the switch is turned off, the current discharges the drain of MN1 to ground. When the switch is turned on, the drain voltage of MN1 rises from ground to the loop-filter voltage held by the PLL. Meanwhile, MN1 has to remain in the linear region until the voltage at its drain node exceeds the minimum saturation voltage. Consequently, a high peak current is generated during this time, even when charge coupling is not considered. This peak is caused by the voltage difference across the two series resistances of the current mirror, MN1, and the switch. The PMOS side behaves in the same way, and matching these peak currents is very difficult, since the peak current varies with the CP output voltage.

Figure 2-5 (b) shows a CP in which the switch is placed at the source of the current-mirror transistor. In this configuration, the mirror transistors MP1 and MN1 are in the saturation region all the time. As a result, a very low bias current can be used with a high output current.

Figure 2-5 (c) shows a CP in which the gate is switched instead of the drain or source. With this topology, the current-mirror circuits operate in the saturation region. However, $gm_{p2}$ and $gm_{n2}$ affect the switching time of this circuit, and the bias currents of *MP2* and *MN2* may not be scaled down if fast switching is required. The gate capacitance of *MP1* and *MP2* is substantial when the output current of the CP is high and long-channel transistors are used for better matching. To save the constant bias current, a gated bias current can also be employed [15].

For NTV operation, CP design is particularly challenging because stacking multiple transistors is limited and overdrive voltages are insufficient, which makes the above topologies difficult to use. Moreover, the low supply voltage makes it much harder for the CP to manage non-ideal behaviors, so these effects must be considered carefully in the NTV region. The next section discusses the non-ideal effects of a CP and their implications for NTV operation.

#### 2-2-1. CP Non-ideal Effects

One of the issues in CP design is the leakage current, which might be caused by the CP itself, the on-chip capacitor, or any leakage on the board. Leakage currents as high as several nanoamperes are common in sub-micron CMOS technologies. The phase offset caused by the leakage current is usually negligible, but the reference spur level it produces can be substantial in a PLL. The leakage current depends not only on the process technology but also on the voltage applied to each transistor [16].

For an NTV CP design, the given supply voltage is much lower than the typical voltage, and generally, all the components receive less than the supply voltage. Therefore, the leakage current issue can be avoided with the appropriate design.

Another effect is the mismatch in the CP, which greatly influences the reference spur. Since CMOS CPs usually implement the UP and DN switches with PMOS and NMOS transistors, respectively, current and switching-time mismatches occur when charge is transferred to the loop filter by the UP and DN operations. When the CP has a mismatch, its effect can be reduced by shortening the turn-on time of the PFD to the minimum output pulse width that still avoids the dead-zone problem.
The amount of phase offset,  $\phi_e$ , due to the current mismatch is given by

$$\left|\phi_{e}\right| = 2\pi \cdot \frac{\Delta t_{on}}{T_{ref}} \cdot \left(\frac{I + \Delta i}{I} - 1\right) = 2\pi \cdot \frac{\Delta t_{on}}{T_{ref}} \cdot \frac{\Delta i}{I}$$
(2-1)

where  $\Delta t_{on}$ ,  $T_{ref}$ , and  $\Delta i$  are the PFD turn-on time, the reference clock period and the current mismatch of the CP ( $\Delta i > 0$  is assumed), respectively. The amount of the reference spur P<sub>r</sub> can be approximately given by

$$\mathbf{P}_{\mathrm{r}} = 20\log\left[\left(\frac{1}{\sqrt{2}}\right) \cdot \frac{f_{BW}}{f_{ref}} \cdot N \cdot \phi_{e}\right] - 20\log\left(\frac{f_{ref}}{f_{P1}}\right) \quad [dBc]$$
(2-2)

where  $f_{ref}$ ,  $f_{BW}$ , N, and  $f_{P1}$  are the reference frequency for the PFD, loop bandwidth of the PLL, divider ratio and the pole frequency in the loop filter, respectively. The loop bandwidth  $f_{BW}$  is given by

$$f_{BW} \cong \frac{I_{CP} \cdot R \cdot K_{VCO}}{2\pi \cdot N}$$
(2-3)

where *R* is the loop filter resistor value,  $K_{VCO}$  is the VCO gain, and N is the divider ratio [15].
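As a rough numerical illustration of Equations (2-1) and (2-2), the following Python sketch evaluates the phase offset and the approximate reference spur level. The function names and all numerical values (turn-on time, mismatch, bandwidth, and pole frequency) are illustrative assumptions, not design values from this work.

```python
import math

def phase_offset(t_on, t_ref, delta_i, i_cp):
    """Eq. (2-1): magnitude of the phase offset in radians."""
    return 2.0 * math.pi * (t_on / t_ref) * (delta_i / i_cp)

def reference_spur_dbc(f_bw, f_ref, n_div, phi_e, f_p1):
    """Eq. (2-2): approximate reference spur level in dBc."""
    first_term = (1.0 / math.sqrt(2.0)) * (f_bw / f_ref) * n_div * phi_e
    return 20.0 * math.log10(first_term) - 20.0 * math.log10(f_ref / f_p1)

# Assumed example: 200 ps PFD turn-on time, 32 ns reference period,
# 5 % current mismatch on a 50 uA CP, N = 16, 1 MHz bandwidth, 4 MHz pole.
phi = phase_offset(t_on=200e-12, t_ref=32e-9, delta_i=2.5e-6, i_cp=50e-6)
spur = reference_spur_dbc(f_bw=1e6, f_ref=31.25e6, n_div=16, phi_e=phi, f_p1=4e6)
print(f"phase offset = {phi:.3e} rad, reference spur = {spur:.1f} dBc")
```

Under these assumptions, halving either the PFD turn-on time or the current mismatch halves the phase offset and lowers the estimated spur by about 6 dB.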

Equations (2-1) and (2-2) clearly show the importance of designing a PFD with a minimal turn-on time as well as a CP with minimal mismatch. A reduced PFD turn-on time is also important for reducing the in-band noise of the PLL. However, the PFD turn-on time inevitably lengthens when the supply voltage is reduced, and the multiple-stacked architectures used for reducing current mismatch are rarely adopted for NTV CPs due to their insufficient overdrive voltage. Figure 2-6 shows state-of-the-art single-ended CP structures [17, 18]. Although these provide reduced current mismatch and variation, they cannot operate with NTV supplies due to the voltage headroom limitation.

In Figure 2-7, dynamic-threshold CMOS (DTCMOS) is used to overcome the voltage headroom problem [19]. However, this requires a triple-well CMOS process, and further reduction in supply voltage is not easy.






Figure 2-6. Schematics of current matching CP (a) with gain boosting circuit [17], and (b) dual-loop compensation circuit [18].



Figure 2-7. Schematic of the DTCMOS CP [19].

Timing mismatches are another non-ideal effect in CPs. In particular, with single-ended CPs, the UP and DN outputs have to drive PMOS and NMOS switches. Even when the delays from the PFD are well matched, the switch transistors have inherently different threshold voltages, capacitances, and resistances, resulting in different timing delays. Since delay increases exponentially as the supply voltage is lowered toward the threshold voltage, timing mismatches can be worse in the NTV region. Figure 2-8 shows that delay variation due to global process variation alone increases by approximately 5X, from around 30% at a nominal operating voltage to as much as 400% at 400 mV. Operating at this voltage also increases sensitivity to temperature and supply ripple, each of which can add another factor of 2X to the performance variation, resulting in a total performance uncertainty of 20X [20].

Finally, charge injection and clock feed-through into a capacitive node  $C_P$  generate phase error to the PLL, as shown in Figure 2-9 (a). Charge injection is a phenomenon that arises due to the remaining charge being injected into a capacitive node during the turn-off of a switch that is connected to that node [21].

When the switches turn off, the charge remaining in the channel is injected into $C_P$, causing a ripple on the output voltage. The CMOS switches also couple the clock transitions to the load capacitor $C_P$ through their gate-drain parasitic capacitors, which is called clock feed-through. This interference is proportional to $W \cdot L \cdot C_{ox}$, where W, L, and $C_{ox}$ are the width, length, and gate capacitance of the MOSFET, respectively. Moreover, when the UP and DN switches turn on simultaneously, voltage node $V_X$ rises, $V_Y$ falls, and $V_{CTRL}$ settles to a balanced voltage. Even if we assume $C_X$ is equal to $C_Y$, the voltage change in $V_X$ is not equal to the change in $V_Y$, and the difference between the two leads to a sudden change in $V_{CTRL}$. Therefore, smaller MOS switches are recommended in order to reduce the effects of clock feed-through and charge injection.

However, as the supply voltage goes down, the CP can no longer deliver sufficient output current, $I_{CP}$, unless the transistors are made larger than those used at the standard supply voltage.



Figure 2-8. Impact of voltage scaling on gate delay variation [20].



Figure 2-9. Mismatching in the  $C_P$  (a) Error caused by charge injection  $Q_C$ and (b) charge sharing between  $C_P$  and capacitors  $C_{UP}$  and  $C_{DN}$ 

## 2-3. VCO Design Consideration

A frequency controllable oscillator is an essential building block of a PLL. There are several important requirements in VCO design, such as low phase noise, wide tuning range, tuning linearity and low power consumption. In particular, VCO is the dominant source of phase noise in a PLL, and the phase noise performance must inevitably be sacrificed to achieve the other requirements.

Consider a generic oscillator whose output voltage $v_o(t)$ has a sinusoidal wave shape and a nominal oscillation frequency $f_o$:

$$v_{o}(t) = [A + a(t)] \cos \left\{ 2\pi \cdot f_{o} \cdot t + \phi(t) \right\}$$
(2-4)

where A is the mean value of the oscillator output amplitude, a(t) is the zero-mean amplitude noise, and $\phi(t)$ contains all phase and frequency departures from the nominal oscillation frequency $f_o$ and phase $2\pi \cdot f_o \cdot t$. The phase disturbance $\phi(t)$ includes the random zero-mean phase noise, the initial phase, and the integrated effects of frequency offset and drift [22].

LC-type VCOs offer better noise performance and thus help reduce the phase noise of a PLL. However, they depend on passive resonant components, which increase the size of the design. In particular, in the NTV supply region, the target oscillation frequency is lower than at the nominal supply voltage, so the inductors must be much larger than those needed for a target frequency of several GHz. Hence, ring-type VCOs are suitable for on-chip integration with reasonable phase noise.

There are several challenges in designing ring-type NTV VCOs. One is the limitation on using multiple-stacked transistors, since stacking reduces the overdrive voltage of the transistors and restricts the VCO output frequency range. Another is the dependence of the VCO on PVT variations. Since the VCO is very sensitive to PVT variations, the frequency range it generates varies widely. In addition, VCO components show exponential, nonlinear V-I characteristics, which make it very hard to control the VCO to the target frequency.

One of the promising solutions to this nonlinear controllability is the body bias technique [23].

If the PMOS transistors are operated in the saturation region,

$$V_{SD} > V_{SG} - \left| V_{THP} \right| \tag{2-5}$$

The source-drain current,  $I_D$  is described by the following equation.

$$I_D = \frac{W}{2L} \mu_p C_{ox} \cdot \left( V_{SG} - \left| V_{THP} \right| \right)^2 (1 + \lambda V_{SD})$$
(2-6)

where W/L is the PMOS width-to-length ratio, $\mu_p$ is the effective mobility of holes, and $C_{ox}$ is the gate capacitance per unit area. The threshold voltage $V_{THP}$ is given by

$$V_{THP} = V_{THP0} - \gamma \cdot \left(\sqrt{2\Phi_F - V_{SB}} - \sqrt{2\Phi_F}\right)$$
(2-7)

where  $\gamma$  is the body factor coefficient and  $2\Phi_F$  is the Fermi potential. By applying the equation (2-7) to the equation (2-6), the  $I_D$  controlled by the threshold voltage is

$$I_{D} = \frac{W}{2L} \mu_{p} C_{ox} \cdot \left( V_{SG} - \left| V_{THP0} - \gamma \cdot \left( \sqrt{2\Phi_{F} - V_{SB}} - \sqrt{2\Phi_{F}} \right) \right| \right)^{2} (1 + \lambda V_{SD})$$

(2-8)

From the equations (2-6) and (2-8),  $I_D$  is proportional to the square of  $V_{SG}$  and linearly proportional to  $V_{SB}$ .

$$I_D \propto V_{SG}^{2}, \quad I_D \propto V_{SB} \tag{2-9}$$

Figure 2-10 shows the linearity of the V-I characteristic curves for PMOS transistors with (a) conventional gate control and (b) body-bias control. This idea opens many possibilities for designing NTV VCOs as well as other sub-blocks. A new implementation of a body-bias VCO is discussed in Chapter 3.
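The dependence of $I_D$ on the body bias in Equations (2-7) and (2-8) can be explored numerically with a short Python sketch. The device parameters below (zero-bias threshold, body factor, $2\Phi_F$, the lumped factor `k`, and $\lambda$) are hypothetical placeholders and are not extracted from the 65-nm process used in this work. The square-law model is used only because the equations above use it.

```python
import math

def pmos_vth(v_sb, v_thp0=-0.35, gamma=0.3, two_phi_f=0.8):
    """Eq. (2-7): PMOS threshold voltage for a source-body bias v_sb (V)."""
    return v_thp0 - gamma * (math.sqrt(two_phi_f - v_sb) - math.sqrt(two_phi_f))

def pmos_id(v_sg, v_sb, v_sd=0.2, k=1e-4, lam=0.1):
    """Eq. (2-8): saturation current; k stands for (W/2L) * mu_p * C_ox."""
    v_ov = v_sg - abs(pmos_vth(v_sb))          # overdrive voltage
    return k * v_ov ** 2 * (1.0 + lam * v_sd)

# Sweep the body bias at a fixed gate drive (assumed V_SG = 0.4 V).
for v_sb in (0.0, 0.1, 0.2, 0.3, 0.4):
    print(f"V_SB = {v_sb:.1f} V -> I_D = {pmos_id(0.4, v_sb) * 1e6:.3f} uA")
```

With these assumed parameters, increasing $V_{SB}$ lowers the threshold magnitude and raises $I_D$ monotonically, which is the control mechanism exploited by the body-bias technique.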



Figure 2-10. V-I characteristic curve for PMOS transistor.

(a) Conventional gate-controlled PMOS and (b) body-bias controlled PMOS.

## 2-4. FD Design Consideration

An FD, also called a prescaler, is a circuit that takes an input signal of a given frequency and generates an output signal at a divided frequency. The FD has to divide the incoming VCO signal so that it can be compared with the reference frequency.

There are two types of FDs: analog dividers and digital dividers. Analog dividers, such as regenerative dividers [24] and injection-locked FDs [25], are useful only in narrow frequency ranges; they have a fixed division ratio and are not programmable. They also require extra circuits for proper dividing operation.

Digital dividers cannot work at frequencies as high as those possible with analog FDs, but they can provide very large division ratios, and they allow output at many different frequencies because they are built with digital circuits and are easily programmable. Moreover, digital FDs do not require tuned circuits or filters, so they are easily incorporated into IC designs. For this reason, digital dividers are by far the most popular FDs in PLLs.

NAND-based D-flip-flop (D-FF) and true single-phase-clock (TSPC) DFF [26] are good examples of digital FDs. A D-FF FD is formed by connecting two gated D-latches together and inverting the enable input to one of them. To make a "divide-by-2" FD, the inverted output signal  $\overline{Q}$  is connected directly back to the data input port, giving the device "feedback," as shown in Figure 2-11.
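The divide-by-2 feedback connection described above can be checked with a small behavioral model in Python. This is only a logic-level sketch of a rising-edge-triggered D-FF with $\overline{Q}$ fed back to D; the function name and the sampled-clock representation are assumptions made for illustration and say nothing about transistor-level timing.

```python
def divide_by_two(clk_samples):
    """Behavioral rising-edge D-FF with Q-bar connected back to D."""
    q = 0
    prev_clk = 0
    out = []
    for clk in clk_samples:
        if prev_clk == 0 and clk == 1:   # rising edge: Q takes D = not Q
            q = 1 - q
        out.append(q)
        prev_clk = clk
    return out

clk = [0, 1] * 4                  # four input clock periods
print(divide_by_two(clk))         # Q toggles once per input rising edge
```

The output changes state only on each rising input edge, so one output period spans two input periods.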

The TSPC FF has two operation modes: hold mode and evaluation mode, as shown in Figure 2-12. In hold mode, also called pre-charge mode, CLK is "0," and node S1 is pre-charged to a certain value due to the input signal, and node S2 goes to VDD. The output node becomes floating as the M7 and M8 transistors are turned off. In evaluation mode, CLK is "1," and if node S1 is pre-charged to VDD, then node S2 is discharged and the MP4 transistor pulls up the output node. If node S1 is pre-charged to "0," then node S2 is not discharged, and the MN4 and MN5 transistors pull down the output node. The timing diagram of TSPC FF is illustrated in Figure 2-13. The gray regions indicate the logic has floating nodes when it operates.

Although digital dividers work well under low supply voltages, NAND-based D-FF, transmission-gate D-FF, and TSPC dividers do not have sufficient frequency range to divide the maximum VCO frequency, especially in the NTV region. Therefore, an extended-TSPC (E-TSPC) D-FF [27] can be an alternative solution. E-TSPC increases its operating frequency by reducing the number of stacked transistors, as shown in Figure 2-14. However, this logic circuit has disadvantages because it operates like pseudo-NMOS logic. The E-TSPC circuit also has a floating node when it operates, resulting in a static current problem that is unlikely to occur in TSPC. Figure 2-15 illustrates the operation of a divide-by-2 E-TSPC circuit. The gray part indicates the duration in which short-circuit current exists.

Fortunately, this short-circuit current has much less impact at an NTV supply than at a nominal supply. Therefore, E-TSPC is a good solution for an NTV frequency divider.



Figure 2-11. (a) Schematic of a D-FF FD and (b) its timing diagram



Figure 2-12. Operation of TSPC flip-flop (a) hold mode and (b) evaluation mode



Figure 2-13. Timing diagram of TSPC FD



Figure 2-14. Schematic of E-TSPC FD



Figure 2-15. Timing diagram of E-TSPC FD

## 2-5. Summary

This chapter described the basic operations and design considerations of the PLL building blocks. It also indicated several challenges that arise when the supply voltage is reduced to the NTV region. In this supply region, internal delay variation increases fivefold compared with the nominal supply voltage, and multiple-stacked circuit topologies cannot be used for analog circuits due to the voltage headroom limitation. Consequently, new architectures and circuit techniques should be considered for the NTV PLL.

# **3. A 500 MHz Ultra-Low Power PLL for NTV operation.**

## **3-1. Introduction**

The demand for enhanced energy efficiency and the design considerations for each building block of a PLL were presented in the earlier chapters. To reduce power consumption, several PLLs operating at a 0.5-V supply voltage have been reported [19, 23, 28, 29]. However, their power efficiencies, defined as the power consumption per PLL output frequency, are still larger than 1 mW/GHz. Therefore, further investigation into circuit techniques for reducing PLL power and increasing PLL output frequency is strongly required.

A fully integrated NTV PLL realized in 65-nm CMOS technology is introduced in this chapter. The proposed PLL successfully achieves 500-MHz operation with power consumption of 127.8  $\mu$ W at 0.4-V supply voltage, which corresponds to record low power efficiency of 0.256 mW/GHz [30]. The following section introduces a novel NTV CP structure which provides good UP and DN current matching characteristics as well as almost flat CP current variation across the VCO control voltages.
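The quoted power efficiency follows directly from the definition given above (power consumption per output frequency). A quick arithmetic check in Python, where the helper name is an assumption introduced only for this illustration:

```python
def power_efficiency_mw_per_ghz(power_w, f_out_hz):
    """Power consumption per output frequency, expressed in mW/GHz."""
    return (power_w * 1e3) / (f_out_hz / 1e9)

# Values stated in the text: 127.8 uW at 500 MHz.
eff = power_efficiency_mw_per_ghz(127.8e-6, 500e6)
print(f"{eff:.4f} mW/GHz")
```

The result agrees with the stated 0.256 mW/GHz after rounding to three decimal places.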

This chapter also discusses an NTV VCO in which each sub-band is controlled with the body-bias voltage so that the VCO gain ($K_{VCO}$) is very small and linear, resulting in improved phase noise characteristics. Moreover, an integrated Automatic Frequency Calibration (AFC) circuit is discussed. This AFC provides several VCO sub-bands spanning a wide output frequency range so that the VCO can generate the target output frequency even with PVT variation.

## **3-2. Proposed NTV Supply PLL Architecture**

One of the most important steps before implementing the PLL is to select the most power-efficient supply voltage for the given 65-nm CMOS technology.

For this, we designed a simple inverter-type ring VCO with an FD and analyzed its power efficiency over supply voltage. Since this circuit contains no supply-limited block, a lower supply appears to give better power efficiency, as shown in Figure 3-1 (a). However, the VCO period and power consumption plotted against supply voltage show a trade-off point around a 0.4-V supply, as shown in Figure 3-1 (b). Therefore, a 0.4-V supply is a proper choice for our NTV design.

The proposed PLL architecture is shown in Figure 3-2. It consists of a pass-gate type PFD, a newly proposed CP, and a second-order on-chip passive loop filter. An AFC circuit provides the VCO coarse tuning voltage. The PLL has a divide-by-16 FD consisting of a combination of an E-TSPC and TSPCs for faster operation with low power consumption.



Figure 3-1. (a) Impact of power efficiencies on voltage scales, and (b) VCO period and power consumption on voltage scales for simple VCO circuit.



Figure 3-2. Proposed PLL architecture

## 3-3. NTV PFD Design.

Because the PFD employs digital logic circuits, it may seem that the PFD operates in a purely digital way. However, the PFD output information is contained in the widths of the UP and DN pulses, which are continuously variable analog quantities. Therefore, the PFD must be designed carefully for correct operation in the NTV region.

The main design goal for the PFD is reducing the dead zone. The dead zone occurs when the two input clocks are very close to each other (small phase error): if the reset delay ($T_{RD}$) is too short, the UP and DN output pulses cannot turn on the switches of the CP, no output is generated, and the small phase difference is lost. This phenomenon introduces jitter to the PLL system. Although a long $T_{RD}$ can reduce the dead-zone problem, long pulses cause current spikes or current mismatches in the CP circuit, which produce reference spurs at the PLL output. For this reason, the reset delay of the PFD must be chosen carefully.
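The trade-off between dead zone and reset delay can be illustrated with a simplified behavioral model in Python. The model is an assumption introduced here for illustration, not the simulated circuit: each CP switch is taken to need a minimum pulse width `t_sw` before it conducts, and the UP and DN pulse widths are taken as the phase error plus $T_{RD}$. All names and values are hypothetical.

```python
def net_charge(dt, t_rd, t_sw, i_cp):
    """Net charge delivered to the loop filter for a time error dt (s).

    dt > 0 means the reference leads. t_rd is the PFD reset delay and
    t_sw is an assumed minimum pulse width needed to turn a CP switch on.
    """
    up_width = max(dt, 0.0) + t_rd
    dn_width = max(-dt, 0.0) + t_rd
    up_on = max(up_width - t_sw, 0.0)    # effective UP conduction time
    dn_on = max(dn_width - t_sw, 0.0)    # effective DN conduction time
    return i_cp * (up_on - dn_on)

I_CP = 50e-6      # assumed CP current
T_SW = 100e-12    # assumed switch turn-on time

# Short reset delay: a 20 ps error produces no charge (dead zone).
print(net_charge(20e-12, 50e-12, T_SW, I_CP))
# Reset delay longer than t_sw: the same error is transferred linearly.
print(net_charge(20e-12, 150e-12, T_SW, I_CP))
```

In this model the dead zone disappears once $T_{RD}$ exceeds `t_sw`, while any additional reset delay only lengthens the time during which both switches conduct, which is when current mismatch contributes to reference spurs.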

For our PLL, we prefer to choose an optimum NTV PFD circuit rather than design a new PFD circuit. Therefore, several PFD structures have been investigated for NTV operation. Figure 3-3 shows several widely used PFD structures. Figure 3-3 (a) shows a traditional PFD based on NAND latches [14]. Figure 3-3 (b) shows another type of NAND latch-based PFD [31]. Figure 3-3 (c) and (d) show two PFDs which are made up of two-input NOR and NAND-type D-FFs, respectively [32]. Finally, to reduce the PLL frequency acquisition time, a PFD based on glitch latches and a PFD based on pass-transistor-type D-FFs are investigated as shown in Figure 3-3 (e) and (f), respectively [33].

To select the optimum PFD architecture, each PFD is designed in 65-nm logic CMOS technology and simulated using the HSPICE simulator. The reset delay of each PFD is adjusted to the smallest value that shows no dead-zone problem.

During the simulation, two rectangular pulses with the same phase are applied and the average width of the UP and DN pulses is measured, which indicates $T_{RD}$. Figure 3-4 shows the simulated $T_{RD}$ of each PFD with the supply voltage varied from -5% to +5% of the nominal supply, that is, from 0.38 V to 0.42 V, and the process corner varied from fast-fast (FF) to slow-slow (SS).

According to the simulated data, the nominal $T_{RD}$ at a 0.4-V supply and the $T_{RD}$ variation due to process variation are illustrated in Figure 3-5. From these results, the PFD of Figure 3-3 (f), based on pass-transistor type D-FFs, has the smallest reset delay and the smallest variation across all process corners. The simulation results are summarized in Table 3-1.











Figure 3-3: Schematics of six PFD structures. (a) NAND latch1, (b) NAND latch2, (c) NOR D-FF, (d) NAND D-FF, (e) Glitch latch D-FF, (f) Pass-transistor D-FF.



Figure 3-4: Simulated reset delay  $(T_{RD})$  vs. supply voltage for each PFD with (a) SS corner, (b) TT corner, (c) FF corner.



Figure 3-5. (a) Simulated $T_{\text{RD}}$ at 0.4 V supply, and (b) $T_{\text{RD}}$ process variation.

| Type                  | FF    | TT    | SS    | Variation |
|-----------------------|-------|-------|-------|-----------|
| (a) NAND LATCH 1      | 0.453 | 1.133 | 3.268 | 2.815     |
| (b) NAND LATCH 2      | 0.255 | 0.63  | 1.789 | 1.534     |
| (c) NOR D-FF          | 0.489 | 1.107 | 2.807 | 2.318     |
| (d) NAND D-FF         | 0.511 | 1.176 | 3.079 | 2.568     |
| (e) Glitch latch D-FF | 0.31  | 0.716 | 2.222 | 1.912     |
| (f) Pass-Tr. D-FF     | 0.255 | 0.594 | 1.594 | 1.339     |

TABLE 3-1

Reset delay  $\left(T_{RD}\right)$  Simulation Summary
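The "Variation" column of Table 3-1 is consistent with the difference between the SS and FF values in each row. The following Python sketch recomputes it from the tabulated data and selects the structure with the smallest spread. The dictionary and helper names are assumptions for illustration, and the values carry the same (unstated) unit as the table.

```python
# (FF, TT, SS) reset delays copied from Table 3-1.
T_RD = {
    "(a) NAND latch 1":      (0.453, 1.133, 3.268),
    "(b) NAND latch 2":      (0.255, 0.630, 1.789),
    "(c) NOR D-FF":          (0.489, 1.107, 2.807),
    "(d) NAND D-FF":         (0.511, 1.176, 3.079),
    "(e) Glitch latch D-FF": (0.310, 0.716, 2.222),
    "(f) Pass-Tr. D-FF":     (0.255, 0.594, 1.594),
}

def corner_spread(ff, tt, ss):
    """Spread of the reset delay between the SS and FF corners."""
    return ss - ff

for name, corners in T_RD.items():
    print(f"{name:24s} spread = {corner_spread(*corners):.3f}")

best = min(T_RD, key=lambda name: corner_spread(*T_RD[name]))
print("smallest spread:", best)
```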

#### **3-4. Mismatch and Variation Tolerant CP**

NTV CP design is particularly challenging due to its limitation in using multiple-stacked transistors and pronounced non-ideal CP behaviors at NTV as we mentioned earlier. CP up and down switching current mismatch increases the phase offset and reference spur level and their variation with VCO control voltages results in PLL bandwidth fluctuation.

In order to address these problems, several single-ended CP structures have been investigated previously [17-19]. Differential CPs have also been reported [34-35], which can effectively eliminate CP current mismatch. However, they require multiple stacked transistors with additional compensating circuits for differential operation, which consume a large amount of power.

These problems can be solved with the proposed CP structure shown in Figure 3-6. It has a gate-switching structure so that the voltage headroom problem can be avoided. The OP-amp output controls the body bias of PMOS transistors, $M_{P1}$ and $M_{P2}$, through feedback, resulting in matched UP/DN currents. Moreover, in order to reduce current variation without sacrificing the VCO's control range, a gain-boosting technique is used to increase the CP output resistance.



Figure 3-6. Proposed CP structure.

## **3-4-1. Proposed CP Analysis**

The proposed CP combines two techniques. One is a current mismatch compensation technique, and the other is a current variation compensation technique.

The proposed current mismatch compensation circuit and its simplified circuit are illustrated in Figure 3-7.

We assume that there is no leakage path from the voltage nodes $V_{CTRL}$ and $V_{REF}$, and that the UP/DN switches are ideal. If Path#1 and Path#2 are ideally disconnected in Figure 3-7 (b), then the output voltage node $V_{CTRL}$ can be expressed as

$$V_{CTRL} = V_{DD} \cdot \left(\frac{1}{1 + (R_{P_1} / R_{N_1})}\right) = V_{DD} \cdot \left(\frac{R_{N_1}}{R_{P_1} + R_{N_1}}\right)$$
(3-1)

In this case, if $R_{N1}$ is equal to $R_{P1}$, then $V_{CTRL}$ is equal to $V_{DD}/2$, resulting in the same UP and DN current.

$$I_{DN} = I_{UP} \tag{3-2}$$

If $V_{CTRL}$ drops, then $I_{DN}$ decreases and $I_{UP}$ increases because $R_{N1}$ and $R_{P1}$ are fixed, resulting in

$$I_{DN} < I_{UP} \tag{3-3}$$

On the other hand, if the  $V_{CTRL}$  increases then the  $I_{DN}$  increases and  $I_{UP}$  decreases.

$$I_{DN} > I_{UP} \tag{3-4}$$

According to equations (3-3) and (3-4), the UP and DN currents cannot be the same when $V_{CTRL}$ varies.
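The uncompensated behavior in Equations (3-1) to (3-4) can be reproduced with the simplified resistive model of Figure 3-7 (b). In the Python sketch below, the function names and resistor values are assumptions chosen only to illustrate the trend with fixed $R_{P1}$ and $R_{N1}$.

```python
def v_ctrl_divider(v_dd, r_p1, r_n1):
    """Eq. (3-1): V_CTRL set by the R_P1 / R_N1 divider."""
    return v_dd * r_n1 / (r_p1 + r_n1)

def branch_currents(v_dd, v_ctrl, r_p1, r_n1):
    """UP and DN currents for fixed resistances at a given V_CTRL."""
    i_up = (v_dd - v_ctrl) / r_p1
    i_dn = v_ctrl / r_n1
    return i_up, i_dn

V_DD, R = 0.4, 4e3    # assumed supply and equal branch resistances
print(v_ctrl_divider(V_DD, R, R))          # balanced point, V_DD / 2
for v in (0.1, 0.2, 0.3):
    i_up, i_dn = branch_currents(V_DD, v, R, R)
    print(f"V_CTRL = {v:.1f} V: I_UP = {i_up*1e6:.1f} uA, I_DN = {i_dn*1e6:.1f} uA")
```

With fixed resistances the two currents match only at the balanced point, which is why a feedback path that adjusts the PMOS side is needed.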

To minimize these current mismatches, a replica feedback circuit, Path#2, is connected to Path#1. Whenever $V_{CTRL}$ changes, the OP-amp senses the input error, $V_{ERR}$, and drives it as close as possible to zero ($V_{CTRL} = V_{REF}$) by adjusting the variable resistor $R_{P2}$ so that it keeps the same resistance as $R_{N2}$.

For $R_{P2} = R_{P1}$ and $R_{N2} = R_{N1}$, if the DN switch is high and the UP switch is low, then $I_{DN}$ decreases. The OP-amp drives $R_{P2}$ to decrease the current $I_{RP2}$, and therefore $I_{RN2}$ also decreases, lowering the voltage node $V_{REF}$ until $V_{REF}=V_{CTRL}$.

$$I_{RN2} = I_{RP2} = I_{DN}$$
(3-5)

Conversely, if $V_{CTRL}$ increases because the UP switch is high and the DN switch is low, then $I_{UP}$ decreases. The OP-amp drives both $R_{P2}$ and $R_{P1}$ to increase the currents $I_{RP2}$ and $I_{UP}$ until $V_{REF}$ rises to $V_{CTRL}$.

$$I_{RN2} = I_{RP2} = I_{UP}$$
(3-6)

Consequently, the charging current  $I_{UP}$  always follows the discharging current  $I_{DN}$  regardless of changing the  $V_{CTRL}$ .



Figure 3-7. (a) CP current mismatch compensation circuit, and (b) Simplified circuit.

This gate-switching architecture inherently has a small output resistance compared with a multiple-stacked one, which means the CP current variation across the VCO control voltage is larger than in a conventional stacked architecture. Therefore, an additional circuit technique for reducing current variation is needed.

Figure 3-8 shows (a) NMOS transistor, and (b) I-V characteristics with output impedance,  $R_{OUT}$ . This clearly shows that increasing output impedance reduces the output current variation across the wide output voltage variation.



Figure 3-8. (a) NMOS transistor, and (b) I-V characteristics with  $R_{OUT}$ .

Figure 3-9 shows increasing the output resistance by adding the gain boosting technique. Consider the NMOS cascode in Figure 3-9 (a), whose output impedance is given by,

$$R_{OUT} = g_{mN2} r_{oN2} r_{oN1}$$
(3-7)

where $g_{mN2}$ is the trans-conductance of $M_{N2}$, and $r_{oN2}$ and $r_{oN1}$ are the output resistances of transistors $M_{N2}$ and $M_{N1}$, respectively. The small voltage change at node X is proportional to $I_{OUT}$, and this voltage can be subtracted from the gate voltage of $M_{N2}$ through current-voltage feedback with gain $A$, as shown in Figure 3-9 (b). By doing this, the output impedance can be increased to

$$R_{OUT} \approx A \cdot (g_{mN2} r_{oN2} r_{oN1}) \tag{3-8}$$

A circuit implementation of this gain boosting circuit is illustrated in Figure 3-9 (c). An inverter is used as a gain amplifier, which detects the voltage node X and regulates the gate voltages of  $M_{N2}$ .

With this, the current through $M_{N2}$ remains constant, resulting in a large resistance looking into the drain of $M_{N2}$ from the voltage node $V_{OUT}$, which can be expressed as

$$R_{OUT} = (g_{mN2}r_{ON2}r_{ON1}) \cdot ((g_{mP3} + g_{mN3}) \cdot (r_{OP3} \parallel r_{ON3})) \tag{3-9}$$

where  $g_{mN2}$ ,  $g_{mP3}$ , and  $g_{mN3}$  are the trans-conductance of transistor  $M_{N2}$ ,  $M_{P3}$ , and  $M_{N3}$ , respectively, and  $r_{ON2}$ ,  $r_{ON1}$ ,  $r_{OP3}$ , and  $r_{ON3}$  are the output resistance of  $M_{N2}$ ,  $M_{N1}$ ,  $M_{P3}$ , and  $M_{N3}$ , respectively.

This technique provides the same resistance as a triple-cascode transistor but without sacrificing any voltage headroom.
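The effect of gain boosting in Equations (3-7) and (3-9) can be estimated numerically. The Python sketch below uses hypothetical small-signal values (100 µS trans-conductances and 200 kΩ output resistances) that are not taken from the actual design; the function names are also assumptions.

```python
def cascode_rout(gm_n2, ro_n2, ro_n1):
    """Eq. (3-7): output resistance of the plain NMOS cascode."""
    return gm_n2 * ro_n2 * ro_n1

def inverter_gain(gm_p3, gm_n3, ro_p3, ro_n3):
    """Gain of the inverter amplifier: (gm_p3 + gm_n3) * (ro_p3 || ro_n3)."""
    return (gm_p3 + gm_n3) * (ro_p3 * ro_n3) / (ro_p3 + ro_n3)

def boosted_rout(gm_n2, ro_n2, ro_n1, gm_p3, gm_n3, ro_p3, ro_n3):
    """Eq. (3-9): cascode output resistance multiplied by the inverter gain."""
    return cascode_rout(gm_n2, ro_n2, ro_n1) * inverter_gain(gm_p3, gm_n3, ro_p3, ro_n3)

GM, RO = 100e-6, 200e3    # assumed values for every transistor
print(f"cascode R_OUT = {cascode_rout(GM, RO, RO) / 1e6:.1f} Mohm")
print(f"inverter gain = {inverter_gain(GM, GM, RO, RO):.1f}")
print(f"boosted R_OUT = {boosted_rout(GM, RO, RO, GM, GM, RO, RO) / 1e6:.1f} Mohm")
```

Under these assumptions the inverter contributes a gain factor equal to that of one additional cascode stage, without adding a stacked transistor to the output branch.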



Figure 3-9. Increasing the output resistance by adding the gain boosting technique (a) NMOS cascode circuit, (b) gain-boosting circuit and (c) gain circuit implementation.

With this idea, the proposed current mismatch compensation CP can be modified to increase its output impedance, as shown in Figure 3-10 (a).

Since we are using an OP-amp for current matching, the voltage nodes $V_{REF}$ and $V_{CTRL}$ are the same as long as the amplifier maintains enough gain. For $R_{N2} = R_{N1}$ and $M_{N4} = M_{N5}$, the $V_{CTRL}$ NMOS path is the same as the $V_{REF}$ NMOS path. Therefore, the compensated gate voltage of $M_{N4}$ is copied to the gate of $M_{N5}$ as shown in Figure 3-10 (b), which controls the discharging current $I_{DN}$ in the same manner.

Figure 3-11 (a) shows the simulation results of CPs without any compensation [15], (b) with proposed current mismatch compensation technique only, and (c) with the mismatch and variation compensation technique, respectively.

The target current is 50  $\mu$ A with 0.4-V supply in 65-nm CMOS. It clearly shows the compensation technique greatly reduces CP current mismatch and its variation over a wide range of control voltages.


Figure 3-10. Proposed mismatch and variation compensation CP. (a) gain-boosting implementation for nodes $V_{REF}$ and $V_{CTRL}$, and (b) final gain-boosting circuit implementation with $V_{REF}$ only.



Figure 3-11. Simulated CP currents (a) without compensation, (b) with current mismatch compensation only and (c) with current mismatch and variation compensation at 0.4-V supply.

### 3-5. NTV VCO Analysis.

An LC-VCO provides better noise rejection and many other advantages, as mentioned in the previous chapter. However, it requires a much larger chip area, especially for oscillation frequencies below 1 GHz, which limits its applicability. Hence, a ring-based VCO is designed for our PLL.

For a ring oscillator with N stages of CMOS inverter delay cells the frequency of oscillation,  $f_o$ , is given by

$$f_{o} = \frac{1}{2Nt_{d}} = \eta \frac{I_{D}}{NC_{L}V_{DD}},$$
(3-10)

where  $t_d$  is the average propagation delay of the inverter delay cell,  $\eta$  is a proportionality constant less than 1,  $C_L$  is the load capacitance of each delay cell, and  $I_D$  is the peak of charging current passing through the PMOS transistor to charge  $C_L$  [36].

Generally, VCO transistors work in the saturation region under nominal VDD conditions. When VDD is near the threshold voltage, the transistors operate in the weak inversion region. Therefore, $I_D$ can be described as

$$I_{D} = I_{o} \cdot e^{(V_{DD} - V_{T})/n\phi_{1}}$$
(3-11)

where $\phi_1$ is the thermal voltage, $n$ is the subthreshold slope factor, and $I_o$ can be determined experimentally. The VCO power consumption *P* [37] can be described as

$$P = N \cdot f_o \cdot C_L \cdot V_{DD}^2 + I_{Leak} \cdot V_{DD}$$
(3-12)

where *N* is the number of delay cells, $C_L$ is the load capacitance between delay cells, and $I_{Leak}$ is the static leakage current. From equations (3-10) and (3-12), the oscillation frequency and power consumption are functions of $V_{DD}$ and $I_D$.
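Equations (3-10) to (3-12) can be combined in a short Python sketch to see how frequency and power scale with the supply. The parameter values below ($I_o$, $V_T$, the slope factor, $C_L$, $\eta$, and the leakage current) are hypothetical and serve only to illustrate the relationships, not to model the fabricated VCO.

```python
import math

def weak_inversion_current(v_dd, v_t, i_o, n=1.5, phi_thermal=0.0259):
    """Eq. (3-11): weak-inversion charging current."""
    return i_o * math.exp((v_dd - v_t) / (n * phi_thermal))

def ring_frequency(i_d, n_stages, c_l, v_dd, eta=0.75):
    """Eq. (3-10): oscillation frequency of an N-stage ring oscillator."""
    return eta * i_d / (n_stages * c_l * v_dd)

def vco_power(n_stages, f_o, c_l, v_dd, i_leak):
    """Eq. (3-12): dynamic plus leakage power."""
    return n_stages * f_o * c_l * v_dd ** 2 + i_leak * v_dd

# Assumed example: 4 stages, 5 fF load, V_T = 0.35 V, I_o = 2 uA, 10 nA leakage.
for v_dd in (0.35, 0.40, 0.45):
    i_d = weak_inversion_current(v_dd, v_t=0.35, i_o=2e-6)
    f_o = ring_frequency(i_d, n_stages=4, c_l=5e-15, v_dd=v_dd)
    p = vco_power(4, f_o, 5e-15, v_dd, i_leak=10e-9)
    print(f"V_DD = {v_dd:.2f} V: f_o = {f_o/1e6:.1f} MHz, P = {p*1e6:.2f} uW")
```

Substituting (3-10) into (3-12) shows that the dynamic term reduces to $\eta \cdot I_D \cdot V_{DD}$, so in this model the dynamic power follows the exponential dependence of $I_D$ on $V_{DD}$.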

To reduce the power consumption of the VCO, the supply voltage, $V_{DD}$, and the leakage current, $I_{Leak}$, should be reduced simultaneously. Although the leakage current decreases as the supply voltage is scaled down, $V_{DD}$ cannot be reduced easily due to the operating limitations of the circuit.

Therefore, there are two big challenges in designing a ring-based VCO with NTV supplies: the limitation on using multiple-stacked transistors, which restricts the VCO output frequency range, and the frequency dependence on PVT variation.

Since PVT variation has a direct impact on the oscillation frequency, the VCO should provide a wide tuning range, resulting in large  $K_{VCO}$ . We solved these problems by designing the VCO with body-bias technique and adding an Automatic Frequency Calibration circuit.

Although a single-ended VCO consumes less power, a differential VCO shows better phase noise performance.

Therefore, a pseudo-differential multi-band VCO shown in Figure 3-12 is employed in our design. The delay cell consists of two inverters with the cross-coupled pair [28].

The coarse tuning is achieved by adjusting the gate voltage of PMOS transistors $M_{P1}$ and $M_{P2}$. The gate voltage, $V_{COARSE}$, is provided by the digitally controlled AFC to select an appropriate VCO sub-band. A tunable PMOS body bias, $V_{CTRL}$, from the CP is used for fine tuning of the VCO frequency. In this way, the VCO obtains a very low $K_{VCO}$ while covering a wide frequency control range over various PVT conditions.



Figure 3-12. Proposed VCO circuit and its delay cells.

#### 3-6. Dual-Loop AFC.

Several calibration techniques have been discussed and used for PLLs [38, 39]. There are two ways of calibrating the VCO in PLL operation: one is applying the digital control codes externally, and the other is applying the codes automatically with a calibration circuit. Although external control provides the best code for a desired operation, it needs additional test time and budget. Therefore, an on-chip automatic calibration circuit is implemented for our PLL.

Figure 3-13 shows the structure of the proposed AFC circuit, which consists of a frequency comparator, an UP/DN code counter, a digital-to-analog converter (DAC), and a lock detector.

The flow chart of the AFC operation is shown in Figure 3-14. Initially, the PLL loop is open and $V_{CTRL}$ is connected to half of the supply voltage. Then, the AFC circuit compares two input signals, CLK\_REF and CLK\_DIV, and generates either UP or DN signals. With these, the UP/DN code counter generates a 4-bit code, which, after going through a DAC, coarsely sets the VCO frequency by selecting a desired VCO sub-band. Then, the lock detector stops the counter and saves the code. After that, the PLL automatically reconnects $V_{CTRL}$ to the PLL loop and begins its phase-locking process.
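The coarse calibration sequence described above can be sketched as a behavioral loop in Python. This is a simplified model under stated assumptions, not the implemented circuit: the lock detector is modeled as a frequency tolerance `tol`, and `example_band` is a hypothetical linear mapping from the 4-bit code to the VCO frequency with $V_{CTRL}$ held at half the supply. All names and numbers are illustrative.

```python
def example_band(code):
    """Hypothetical VCO sub-band centre frequency (Hz) for a 4-bit code."""
    return 350e6 + 50e6 * code

def run_afc(f_ref, n_div, band_freq, tol, code=0):
    """Step the UP/DN counter until CLK_DIV is within tol of CLK_REF."""
    for _ in range(16):                      # at most one pass over 4-bit range
        f_div = band_freq(code) / n_div      # divided VCO clock, CLK_DIV
        if abs(f_div - f_ref) <= tol:        # lock detector: stop and hold code
            break
        if f_div < f_ref:
            code = min(code + 1, 15)         # UP: select a faster sub-band
        else:
            code = max(code - 1, 0)          # DN: select a slower sub-band
    return code

code = run_afc(f_ref=31.25e6, n_div=16, band_freq=example_band, tol=1e6)
print(f"selected code = {code:04b}")
```

After the code is frozen, the fine tuning through $V_{CTRL}$ only has to cover the residual error within one sub-band, which is what allows a small $K_{VCO}$.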

Figure 3-15 shows the simulated results of (a) the digital output of the proposed AFC circuit and (b) the VCO control signal, $V_{CTRL}$. The AFC code settles to 0011, showing that the digital calibration works as intended.

The digital circuits of the AFC block occupy a relatively large chip area, because all blocks are custom designed and laid out for proper operation with the NTV supply.



Figure 3-13. Proposed AFC circuit building blocks.



Figure 3-14. Dual-loop AFC flow chart



Figure 3-15. Simulated result of (a) the digital output of the proposed AFC circuit, and (b) VCO control signal,  $V_{CTRL}$ .

### **3-7. TSPC and E-TSPC FD Analysis.**

CMOS FDs, such as D-FF FDs and transmission-gate FDs, eliminate static current consumption and use fewer transistors than CML FDs. However, they are slow because their positive feedback loop adds delay to the switching time. Two factors limit the switching speed of static circuits: the current conduction level of the CMOS transistors and the parasitic capacitances. Since the parasitic capacitances are not controllable, using a more advanced process technology is the only way of increasing the switching speed.

To overcome this problem, dynamic logic circuits such as TSPC and E-TSPC have been developed, which use the parasitic RC elements as integral parts of the circuit. Although both TSPC and E-TSPC FDs are suitable for low-power dividers, especially for target frequencies below several GHz, the choice between them is crucial because power consumption and operating frequency are both critical factors under an NTV supply.

The E-TSPC FD uses two transistors while the TSPC FD uses three stacked transistors. The two circuits and their equivalent RC models are shown in Figure 3-16.




Figure 3-16. Equivalent RC model for (a) TSPC and (b) E-TSPC.

The time constants for charging and discharging of the first stage of the TSPC FD are calculated using the Elmore RC ladder method described in [40], as follows

$$\tau_{N1} = R_{D1} \cdot C_{OUT} \tag{3-13}$$

$$C_{OUT} = C_{DBN1} + 2C_{GDN1} + C_{DBP2} + 2C_{GDP2} + C_{fan} \tag{3-14}$$

$$\tau_{P1} = (R_{D1} + R_{CLK}) \cdot C_{OUT} + R_D \cdot C_D \tag{3-15}$$

$$C_D = C_{GSP2} + C_{GSP1} + C_{P+} \tag{3-16}$$

where  $C_{P_+}$  is the total depletion capacitance between the series PMOS transistors and  $C_{fan}$  is the fan-out capacitance.

The propagation delay of the first stage TSPC FD is given by,

$$\tau_{N1} = R_{CLK} \cdot C_{OUT} \tag{3-17}$$

$$\tau_{P1} = R_D \cdot C_{OUT} \tag{3-18}$$

$$t_{p}|_{TSPC} = \frac{0.69(\tau_{P1} + \tau_{N1})}{2} \tag{3-19}$$

In the same way, the propagation delay of the first E-TSPC stage is given by

$$\tau_{N1} = R_{CLK} \cdot C_{OUT} \tag{3-20}$$

$$\tau_{P1} = R_D \cdot C_{OUT} \tag{3-21}$$

$$t_{p}|_{E\text{-}TSPC} = \frac{0.69(\tau_{P1} + \tau_{N1})}{2} \tag{3-22}$$

In practice, the output load capacitance of an E-TSPC stage is smaller than that of a TSPC stage due to the reduced fan-out, so the propagation delay of the TSPC is higher than that of the E-TSPC. Therefore, the TSPC FD has a lower maximum operating frequency than the E-TSPC FD.

However, although the output capacitance of the TSPC is higher than that of the E-TSPC, one transistor in each TSPC stage is always turned off, so no short-circuit current flows. As a result, the TSPC FD consumes less power than the E-TSPC FD.
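The delay expressions of Equations (3-17) to (3-22) can be evaluated numerically to see how a smaller output capacitance shortens the first-stage delay. The sketch below is a minimal illustration: the resistance and capacitance values are hypothetical placeholders, not extracted from the 65-nm design, and the same $R_{CLK}$ and $R_D$ are assumed for both dividers so that only $C_{OUT}$ differs.

```python
def propagation_delay(tau_p, tau_n):
    """First-stage propagation delay: t_p = 0.69 * (tau_P1 + tau_N1) / 2."""
    return 0.69 * (tau_p + tau_n) / 2.0


# Hypothetical equivalent resistances (ohms) and output capacitances (farads).
R_CLK = 20e3
R_D = 30e3
C_OUT_TSPC = 3e-15    # assumed larger load for the TSPC stage
C_OUT_ETSPC = 2e-15   # assumed smaller load for the E-TSPC stage

# tau_N1 = R_CLK * C_OUT and tau_P1 = R_D * C_OUT for each divider type.
tp_tspc = propagation_delay(R_D * C_OUT_TSPC, R_CLK * C_OUT_TSPC)
tp_etspc = propagation_delay(R_D * C_OUT_ETSPC, R_CLK * C_OUT_ETSPC)

print(f"TSPC   t_p = {tp_tspc * 1e12:.2f} ps")
print(f"E-TSPC t_p = {tp_etspc * 1e12:.2f} ps")
```

Because both time constants scale with $C_{OUT}$, the delay ratio in this simplified model equals the capacitance ratio; the real dividers also differ in their equivalent resistances.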

Table 3-2 shows the simulated maximum operating frequency of various digital FDs in 65-nm CMOS technology, together with the peak-to-peak jitter when a 10% VDD variation at 1 MHz is deliberately applied. Their minimum operating frequencies are not compared here because they all lie in the range of 10 to 20 MHz.

Based on these results, our PLL uses a combination of one E-TSPC FD and three stages of TSPC FDs to achieve fast operation with low power consumption.

# TABLE 3-2

## Maximum FD operation frequency

|                 | NAND | TG   | TSPC | E-TSPC |
|-----------------|------|------|------|--------|
| SSHT (MHz)      | 120  | 180  | 300  | 600    |
| FFCT (MHz)      | 800  | 1200 | 1200 | 3000   |
| P-P Jitter (ps) | 465  | 380  | 215  | 76     |

SSHT : SS corner,  $80 \degree C$ . FFCT : FF corner,  $-25 \degree C$ .

### **3-8. Measurement Results.**

The proposed PLL is fabricated in 65-nm standard CMOS technology and mounted on an FR4 board for measurement. Figure 3-17 shows the test board and Figure 3-18 shows the measurement setup.

A 31.25-MHz reference frequency is provided by a signal generator, and the PLL output is measured with an oscilloscope and a spectrum analyzer.

Figure 3-19 shows (a) the layout and (b) the die microphotograph. The total area is about $0.0285 \text{ mm}^2$ excluding the output buffer.

Figure 3-20 shows the measured VCO frequencies for three different VCO sub-bands. For this measurement, the AFC codes are forced and an external voltage is supplied to the VCO input. As can be seen in the figure, our VCO provides small and constant gain over a wide frequency range. In order to verify that our CP currents do not change much with the VCO control voltage, the PLL reference frequency is changed from 26.875 MHz to 37.5 MHz, with the corresponding PLL output frequency changing from 430 MHz to 600 MHz. This changes $V_{CTRL}$ within the same VCO sub-band. The PLL loop bandwidth, which is directly influenced by the CP current, is then measured.



Figure 3-17. Test board for measurement



Figure 3-18. Measurement setup.








Figure 3-19. Proposed PLL (a) layout and (b) die micro-photograph.



Figure 3-20. Measured VCO frequencies for sub-bands selected with different AFC codes [0001], [0011], and [0111].



Figure 3-21. Measured and simulated PLL loop bandwidth for different VCO control voltages.

Figure 3-21 shows the measurement results as well as the simulated results for a PLL without any CP compensation. The results clearly show that our PLL has very small CP current variation for a wide tuning range.

The measured phase noise correlates well with the simulation result, as shown in Figure 3-22. The measured and simulated PLL loop bandwidths are around 2.5 MHz.

The measured PLL output spectrum is shown in Figure 3-23. The reference spur at an offset frequency of 31.25 MHz is -59 dBc. The measured PLL phase noise is -94 dBc/Hz at 1-MHz offset for the 500-MHz output frequency.

Figure 3-24 shows the measured jitter characteristics, also at 500 MHz. The RMS jitter is 16.9 ps, corresponding to 0.0084 unit interval (UI). The PLL consumes 127.8 $\mu$W excluding the output buffer, resulting in a power efficiency of 0.256 mW/GHz.
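The unit-interval and power-efficiency figures follow directly from the measured values. The short Python check below assumes a power consumption of 127.8 µW, as stated in the conclusion of this dissertation; the helper names are illustrative only.

```python
def jitter_in_ui(jitter_s, f_out_hz):
    """RMS jitter expressed as a fraction of one output clock period."""
    return jitter_s * f_out_hz


def power_efficiency_mw_per_ghz(power_w, f_out_hz):
    """Power efficiency as power (mW) divided by output frequency (GHz)."""
    return (power_w * 1e3) / (f_out_hz / 1e9)


F_OUT = 500e6          # output frequency in Hz
ui = jitter_in_ui(16.9e-12, F_OUT)
eff = power_efficiency_mw_per_ghz(127.8e-6, F_OUT)

print(f"jitter = {ui:.5f} UI")
print(f"efficiency = {eff:.4f} mW/GHz")
```

The jitter evaluates to about 0.0085 UI, in line with the roughly 0.0084 UI quoted above, and the efficiency to about 0.256 mW/GHz.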



Figure 3-22. Phase noise of the PLL: (a) simulation and (b) measurement results.



Figure 3-23. Measured output spectrum at 500 MHz.



Figure 3-24. Measured output jitter at 500 MHz.

### 3-9. Summary.

A 0.4-V NTV PLL is demonstrated in standard 65-nm CMOS technology. The newly proposed CP in our PLL successfully reduces CP current mismatch and variation. The measurement results show that the automatically calibrated VCO with a body-bias technique provides small VCO gain over a wide range of output frequencies.

At the target output frequency of 500 MHz, our PLL also achieves power efficiency of 0.256 mW/GHz.

# 4. NTV PLLs Application Extension

## **4-1. Introduction**

For a low-voltage charge-pump PLL (CPPLL), there is ideally no mismatch between the charging and discharging currents of the CP when the loop is locked. In real implementations, however, non-idealities such as current mismatch and leakage current at the CP output modulate the VCO control voltage and appear as reference spurs in the PLL output spectrum.

In this chapter, we extend our research to demonstrate a new CPPLL architecture which overcomes the non-idealities of CP and its effects on the PLL performance.

## 4-2. A 90~350 MHz NTV PLL with ALF CP

In an effort to further improve PLL power efficiency, we investigate a new NTV PLL architecture in 65-nm CMOS technology. In particular, we propose a novel CP structure suitable for NTV operation with good current matching characteristics. The prototype CPPLL achieves 90~350-MHz operation with 0.4-V supply voltage with power efficiency of 0.31 mW/GHz at 350 MHz [41].

### 4-2-1. Proposed PLL Architecture

Figure 4-1 shows the proposed ALF CPPLL architecture, which extends our NTV PLL design. It consists of a conventional PFD, a CP with an ALF, a pseudo-differential ring-based VCO, and an AFC circuit. It also includes a divide-by-16 FD.



Figure 4-1. Proposed ALF CPPLL architecture.

### 4-2-2. Charge Pump with Active Loop Filter

CMOS CPs usually have PMOS and NMOS switches which cannot produce perfect current matching characteristics due to output impedance mismatches and capacitance mismatches.

Several differential CPs with ALF have been reported [42-44] which can effectively eliminate mismatches in CP, but they require multiple stacked transistors with additional compensating circuits for differential operation which consume a large amount of power.

However, using an ALF provides several advantages, such as a wide CP tuning range and perfect isolation between the CP output node and the VCO input node. Therefore, a novel CP structure with an active loop filter is proposed, as shown in Figure 4-2.



Figure 4-2. Proposed CP with ALF.

The fixed bias voltages (pbias, nbias) can cause significant mismatch between UP and DN currents depending on CP output voltage,  $V_{CP}$ , as well as PVT variation as shown in Figure 4-3. However, if  $V_{CP}$  is fixed to a specific value that balances UP and DN currents as marked in Figure 4-3, current mismatch can be eliminated, which can be achieved with an ALF.

With the ALF, $V_{CP}$ is isolated from the VCO control voltage, $V_{CTRL}$, and follows the reference voltage, $V_{REF}$. The CP output current path (Path #1) is duplicated in a replica path (Path #2), and the same nbias and pbias voltages are applied to both paths. Because the switching transistors in Path #2 ($M_{P2}$, $M_{N2}$) are always turned on, $V_{REF}$ settles at the value that makes the currents through $M_{P2}$ and $M_{N2}$ identical. With the OP-Amp, $V_{CP}$ follows $V_{REF}$ and, as a result, the UP and DN currents in Path #1 are matched regardless of PVT variation. We designed these paths with transistors of exactly the same sizes and laid them out very carefully so that the reference spur due to mismatch between Path #1 and Path #2 is minimized. Figure 4-4 (a) shows simulation results of the CP output currents and Figure 4-4 (b) shows the dependence of $V_{CP}$ and $V_{REF}$ on NMOS bias current variation for different process corners. Even if we force extra NMOS bias current to cause mismatch, the UP and DN currents remain the same and $V_{CP}$ is equal to $V_{REF}$ as long as the OP-Amp has enough gain. The OP-Amp is configured as a two-stage NMOS mirrored OTA so that rail-to-rail operation can be achieved without a voltage headroom problem, as shown in Figure 4-5.

CP gate switching also affects the reference spur. Due to the parasitic capacitance of the switches and bias transistors, clock feed-through generates peak currents at the rising and falling edges during up/down switching. We carefully determined the charge pump transistor sizes and the output slope of the PFD to minimize unwanted switching feed-through.
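The replica-balancing idea can be illustrated with a first-order numerical model. In the Python sketch below, the UP and DN currents are modelled with a simple channel-length-modulation term; the nominal currents, the modulation coefficient, and the linear form itself are hypothetical assumptions for illustration, not the simulated device behaviour. The replica path is represented by solving for the voltage at which the two always-on currents are equal, and the OP-Amp is assumed ideal so that $V_{CP}$ equals that voltage.

```python
VDD = 0.4  # supply voltage in volts


def i_up(v_cp, i0=1.0e-6, lam=1.5):
    """Hypothetical PMOS (UP) current: increases with |V_DS| = VDD - v_cp."""
    return i0 * (1.0 + lam * (VDD - v_cp))


def i_dn(v_cp, i0=1.1e-6, lam=1.5):
    """Hypothetical NMOS (DN) current: increases with V_DS = v_cp."""
    return i0 * (1.0 + lam * v_cp)


def balance_voltage(lo=0.0, hi=VDD, iters=60):
    """Bisection on i_up - i_dn, which falls monotonically as v_cp rises.

    This plays the role of the replica path: V_REF is the voltage at which
    the always-on PMOS and NMOS currents are identical.
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if i_up(mid) > i_dn(mid):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


v_ref = balance_voltage()
mismatch_at_vref = i_up(v_ref) - i_dn(v_ref)      # ideal OP-Amp: V_CP = V_REF
mismatch_at_fixed = i_up(0.25) - i_dn(0.25)       # V_CP held at a fixed 0.25 V

print(f"V_REF = {v_ref:.4f} V")
print(f"mismatch at V_REF  = {mismatch_at_vref:.3e} A")
print(f"mismatch at 0.25 V = {mismatch_at_fixed:.3e} A")
```

In this toy model the mismatch vanishes at the balance voltage and is a sizeable fraction of the nominal current at a fixed 0.25 V, which is the qualitative behaviour the replica path with the OP-Amp is meant to exploit.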

The proposed architecture has several advantages. First, the OP-Amp does not need to have as high a slew rate as in Figure 2-6 (b), resulting in power reduction. Although the CP output pulses can have high-frequency components, they can be filtered by a low-pass filter before the OP-Amp and, consequently, the OP-Amp can be relieved from the high slew-rate requirement. Second, the VCO control voltage,  $V_{CTRL}$ , can swing the full supply range regardless of CP output voltage,  $V_{CP}$ . In the case of the PLL based on a passive loop filter,  $V_{CP}$  is equal to  $V_{CTRL}$ . Thus, if CP provides a narrow voltage range, then the VCO output range is also restricted.



Figure 4-3. Simulated CP output current against CP output voltage for different process corners.



Figure 4-4. Simulation of (a) CP output current, and (b)  $V_{CP}$  and  $V_{REF}$  voltage against NMOS bias current variation for different process corners.



Figure 4-5. Two stage NMOS mirrored-OTA circuit

### 4-2-3. VCO with AFC

For our VCO, we employ the 2-stacked NMOS feedback delay cells used in our previous PLL, as shown in Figure 4-6. In addition, the VCO is implemented with four stages of delay cells that generate four different clock phases, which allows the PLL to be used for multi-purpose clock generation.

We also implement an AFC circuit which includes a resistor-ladder DAC for low power dissipation after the PLL is locked, as shown in Figure 4-6.

The VCO receives two input signals, $V_{CTRL}$ and $V_{AFC}$. $V_{CTRL}$ is the fine-tuning signal from the ALF, and $V_{AFC}$ is the digitally controlled coarse-tuning signal from the AFC. The AFC block compares $V_{CTRL}$ with two threshold voltages, $V_{REF\_UP}$ and $V_{REF\_DN}$. The result of this comparison changes the built-in 4-bit code counter. Once the VCO output reaches the target frequency region, the control circuit stops counting and saves the code to the latches which control the DAC for the desired $V_{AFC}$.
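The threshold comparison can be sketched as a window comparator that drives the code counter. The Python example below is a behavioural sketch under assumptions: the threshold values of 0.3 V and 0.1 V, the counting direction (a high $V_{CTRL}$ increments the code), and the function name `afc_step` are all hypothetical and are not taken from the design.

```python
def afc_step(code, v_ctrl, v_ref_up=0.3, v_ref_dn=0.1, n_bits=4):
    """One update of the AFC code counter.

    Returns (new_code, done). `done` is True when V_CTRL lies inside the
    window (target frequency region reached) or the counter is saturated.
    """
    max_code = (1 << n_bits) - 1
    if v_ctrl > v_ref_up and code < max_code:
        return code + 1, False   # assumed direction: move to a faster band
    if v_ctrl < v_ref_dn and code > 0:
        return code - 1, False   # assumed direction: move to a slower band
    return code, True            # stop counting and latch the code


print(afc_step(5, 0.35))  # prints: (6, False)
print(afc_step(5, 0.20))  # prints: (5, True)
```

Once `done` is returned, the latched code would hold the DAC output, and hence $V_{AFC}$, constant while the loop fine-tunes through $V_{CTRL}$.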



Figure 4-6. Proposed VCO with AFC circuit.

The simulated VCO tuning range for different AFC codes is shown in Figure 4-7. In simulation, $K_{VCO}$ varies from 60 MHz/V to 375 MHz/V as the AFC code changes from 0000 to 1111. The VCO provides linear and low $K_{VCO}$, and the AFC circuit offers a wide frequency tuning range.



Figure 4-7. Simulated VCO tuning range with different AFC codes

### 4-2-4. Measurement Results

The proposed PLL is fabricated with standard 65-nm CMOS technology and mounted on FR4 board for measurement. The reference frequency is provided by a signal generator, and the PLL output is measured by an oscilloscope and a spectrum analyzer.

Figure 4-8 shows a die microphotograph and the layout. The total area is about  $0.0081 \text{ mm}^2$  excluding the off-chip loop filter.

Figure 4-9 shows the measured PLL power consumption and power efficiency for varying output frequencies. Since our PLL has a fixed division ratio of 16, the PLL output frequency is controlled by changing the reference frequency. With a 5.625-MHz to 21.875-MHz reference input, the PLL generates a 90-MHz to 350-MHz output with increasing power efficiency. Low-frequency operation is limited by the loop bandwidth, and high-frequency operation is limited by the SS process corner.
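With a fixed division ratio, the output frequency is simply the reference frequency multiplied by 16. A minimal Python check of the quoted frequency range, using an illustrative helper name, is:

```python
DIVIDE_RATIO = 16  # fixed feedback division ratio stated in the text


def output_freq_mhz(f_ref_mhz, n=DIVIDE_RATIO):
    """Locked PLL output frequency for a given reference frequency."""
    return n * f_ref_mhz


for f_ref in (5.625, 21.875):
    print(f"{f_ref} MHz reference -> {output_freq_mhz(f_ref)} MHz output")
```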

Figure 4-10 shows the measured PLL output spectra at 350 MHz with the fixed CP reference voltage ( $V_{REF}$ ) at 0.25 V and with automatic compensation. The reference spur is -40.3 dBc with fixed  $V_{REF}$  and -55.3 dBc with automatic compensation.

Figure 4-11 shows the measured phase noise of our PLL at 350-MHz output frequency. Its phase noise is -90.5 dBc/Hz at 1-MHz offset.

Figure 4-12 shows the measured jitter characteristics, also at 350 MHz. The rms jitter is 30.8 ps (0.01 UI).



Figure 4-8. Microphotograph and layout of the PLL.



Figure 4-9. Measured power consumption and power efficiency with different output frequency



Figure 4-10. Measured output spectra at 350 MHz: with the fixed CP reference voltage (0.25V) and with automatic compensation.



Figure 4-11. Phase noise of the PLL at 350 MHz.



Figure 4-12. Measured timing jitter of the PLL at 350 MHz.

Table 4-1 shows the power consumption for each PLL block with 0.4V supply. The overall power consumption is 109  $\mu$ W at 350-MHz output frequency, which corresponds to 0.31 mW/GHz of power efficiency.

| Block                          | Power        |
|--------------------------------|--------------|
| PFD                            | 1.04 $\mu$W  |
| CP                             | 12.7 $\mu$W  |
| VCO                            | 54.4 $\mu$W  |
| FD                             | 15.1 $\mu$W  |
| ALF                            | 14.0 $\mu$W  |
| AFC                            | 8.68 $\mu$W  |
| Others                         | 2.8 $\mu$W   |
| Total Power @ 0.4 V, 350 MHz   | 109 $\mu$W   |

# TABLE 4-1

## Power Breakdown
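The block powers in Table 4-1 can be summed to cross-check the total and the quoted power efficiency. The following Python snippet uses only the values from the table; the variable names are illustrative.

```python
# Per-block power in microwatts, taken from Table 4-1.
block_power_uw = {
    "PFD": 1.04,
    "CP": 12.7,
    "VCO": 54.4,
    "FD": 15.1,
    "ALF": 14.0,
    "AFC": 8.68,
    "Others": 2.8,
}

total_uw = sum(block_power_uw.values())
efficiency_mw_per_ghz = (total_uw / 1000.0) / 0.35   # 350 MHz = 0.35 GHz
vco_share = block_power_uw["VCO"] / total_uw

print(f"total = {total_uw:.2f} uW")
print(f"efficiency = {efficiency_mw_per_ghz:.3f} mW/GHz")
print(f"VCO share = {vco_share:.1%}")
```

The sum is about 108.7 µW, which rounds to the 109 µW total, and the VCO accounts for roughly half of the power.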

## 4-3. Summary

We demonstrate a 0.4-V, 90~350-MHz PLL in standard 65-nm CMOS technology. The CP in our PLL has an ALF architecture which operates well with a ULV supply without current mismatch, resulting in a significant reduction in reference spur. It also has a body-biased VCO with an AFC circuit that provides low VCO gain with a wide tuning range. The PLL consumes only 109 $\mu$W for a 350-MHz output, which corresponds to a power efficiency of 0.31 mW/GHz.

# 5. Conclusion

In this dissertation, ultra-low-power and power-efficient PLLs were developed in standard CMOS technology. This research aimed to design new NTV CPs with reduced non-idealities and to create dual-loop VCOs that provide maximum performance in the NTV region. The automatically calibrated VCO with a body-bias technique can provide small VCO gain over a wide range of output frequencies. In addition, a PFD with no dead-zone problem and very short delay times across PVT variation was designed. We also analyzed several digital FDs and combined an E-TSPC FF with TSPC FFs, resulting in a faster operating frequency as well as lower power consumption.

As a result, a fully integrated 500-MHz PLL with power consumption of 127.8  $\mu$ W at 0.4-V was designed in 65-nm CMOS technology in Chapter 3.

We also demonstrated a new NTV PLL with an ALF CP architecture in Chapter 4, which significantly reduces the reference spur and the slew-rate burden of the OP-Amp.

Figure 5-1 shows the power consumption and power efficiency of state-of-the-art ultra-low supply PLLs again, and compares them with our proposed PLLs.

Table 5-1 summarizes the performance of our PLLs and compares it with state-of-the-art ULV PLLs using the normalized figure of merit (FOM). For this, the FOM defined in [45] as

$$FOM = \frac{area(mm^2)}{\left(\frac{tech.}{0.065}\right)^2} \cdot \left[\frac{mW}{MHz}\right]^{1.5} \cdot [Jitter_{rms}(ps) \cdot \sqrt{mW}]^2 \tag{4-3}$$

is used, where the area is normalized to 65-nm technology.
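The FOM expression can be evaluated directly from the entries of Table 5-1. The Python sketch below computes it for the two proposed PLLs. Dividing by the FOM of the 500-MHz PLL to obtain the normalized value is an assumption made here; it appears consistent with the 1.0 and 1.4 entries in the table, but the normalization step is not spelled out in the text.

```python
import math


def fom(area_mm2, tech_um, power_mw, freq_mhz, jitter_rms_ps):
    """FOM from the expression above; the area is scaled to 65-nm technology."""
    area_norm = area_mm2 / (tech_um / 0.065) ** 2
    energy_term = (power_mw / freq_mhz) ** 1.5
    jitter_term = (jitter_rms_ps * math.sqrt(power_mw)) ** 2
    return area_norm * energy_term * jitter_term


# Values taken from Table 5-1 (65-nm technology for both PLLs).
fom_500 = fom(0.0211, 0.065, 0.13, 500.0, 16.9)    # 500-MHz PLL [30]
fom_350 = fom(0.0081, 0.065, 0.109, 350.0, 30.8)   # 90~350-MHz PLL [41]

# Assumed normalization: reference everything to the 500-MHz PLL.
ratio = fom_350 / fom_500
print(f"normalized FOM of the 350-MHz PLL = {ratio:.2f}")
```

A smaller value is better under this definition, since area, energy per cycle, and jitter all appear in the numerator.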

For fair comparison, the areas of the PLL cores, which include the PFD, CP, VCO, divider, and calibration circuits but not the loop filter, are used for the FOM comparison. As can be seen in the table, our PLLs achieve the lowest reference spur, the best power efficiency, and the best normalized FOM among recently reported ULV PLLs.



Figure 5-1. Power consumption and power efficiency of recent ultra-low supply voltage PLLs and proposed PLLs.

| Performance Parameter          | Ref. [19]          | Ref. [23] | Ref. [28] | Ref. [29] | Our Work [41] | Our Work [30] |
|--------------------------------|--------------------|-----------|-----------|-----------|---------------|---------------|
| Technology (CMOS)              | 90nm               | 130nm     | 180nm     | 130nm     | 65nm          | 65nm          |
| Supply Voltage (V)             | 0.5                | 0.5       | 0.5       | 0.5       | 0.4           | 0.4           |
| VCO type                       | Ring-VCO           | Ring-VCO  | LC-VCO    | Ring-VCO  | Ring-VCO      | Ring-VCO      |
| PLL Core Area (mm$^2$)         | 0.012              | 0.04      | N/A       | 0.0736*   | 0.0081        | 0.0211        |
| Power (mW)                     | 0.4                | 1.25      | 4.5       | 0.44*     | 0.109         | 0.13          |
| Output freq. (GHz)             | 0.4                | 0.55      | 1.9       | 0.4-0.433 | 0.35          | 0.5           |
| Ref. Spur (dBc)                | -40.28 @ 2.24 GHz  | N/A       | -43.7     | -38.3     | -55.3         | -59           |
| RMS Jitter (ps)                | 9.62               | 8.01      | N/A       | 5.5       | 30.8          | 16.9          |
| Phase Noise @ 1 MHz (dBc/Hz)   | -87                | -95       | -120.4    | -91.5     | -90           | -94           |
| Power efficiency (mW/GHz)      | 1                  | 2.272     | 2.368     | 1.016     | 0.31          | 0.256         |
| Normalized FOM                 | 2.19               | 70.26     | N/A       | 2.41      | 1.4           | 1.0           |

TABLE 5-1

#### PERFORMANCE SUMMARY

\*: Including pulse-swallow counter, excluding AFC circuit

## **Bibliography**

- [1] D. Helms, E. Schmidt, W. Nebel, Leakage in CMOS Circuits—An Introduction. In Integrated Circuit and System Design. Power and Timing Modeling, Optimization and Simulation, 14th International Workshop (PATMOS 2004); Springer: Berlin, Germany, 2004; pp. 17–35
- [2] Eitan N. Shauly, "CMOS leakage and power reduction in transistors and circuits: process and layout considerations" Journal of Low Power Electron. Appl., pp. 1-29, Jan. 2012.
- [3] S. Thompson, P. Packan, and M. Bohr, MOS Scaling: Transistor Challenges for the 21st Century, Intel Technology Journal, Q3, 1998.
- [4] Enrico Macii, "Ultra low-power electronics and design", Kluwer academic publishers, 2004.
- [5] F. Pollack, New Microarchitecture Challenges in the Coming Generations of CMOS Process Technologies, Micro32 Keynote, 1999.
- [6] N. Kim et al., Leakage Current: Moore's Law Meets Static Power, IEEE Computer, Vol 36, No. 12, December 2003, pp. 68-75.
- [7] International Technology Roadmap for Semiconductors 2012.[Online]. Available: http://www.itrs.net/Links/2012ITRS/ Home2012.htm.
- [8] Himanshu Kaul, "Near-threshold voltage (NTV) design opportunities and challenges" DAC 2012.

- [9] Surhud Khare, Shailendra Jain, "Prospects of near-threshold voltage design for green computing", Int. Conference on VLSI design and the 12th Int. conference on embedded systems, 2013
- [10] Dejan Markovic, Cheng C. Wang, Louis P. Alarcon, Tsung-Te Liu, and Jan M. Rabaey, "Ultralow-power design in near-threshold region", Proceedings of the IEEE, vol. 98, no. 2, pp. 237-252, Feb. 2010.
- [11] H. Kaul et al, "A 300 mV 494GOPS/W reconfigurable dual-supply 4-way SIMD vector processing accelerator in 45 nm CMOS", JSSC, Vol. 45, Issue: 1, 2010
- [12] H. Kaul et al., "A 1.45GHz 52-to-162GFLOPS/W variable-precision floating-point fused multiply-add unit with certainty tracking in 32nm CMOS", ISSCC, 2012.
- [13] B. Razavi, "Monolithic Phase-Locked Loops and Clock Recovery Circuits", *IEEE Press*, 1996.
- [14] J. G. Maneatis, "Low-jitter Process-Independent DLL and PLL Based on Self-Biased Techniques", *IEEE J. Solid-State Circuits*, vol. 31, no. 11, pp. 1723-1732, Nov. 1996.
- [15] Woogeun Rhee, "Design of high-performance CMOS charge pumps in phase-locked loops," *Proc. IEEE Int. Symp. Circuits and Systems*, vol. 1, pp. 545-548, July 1999.
- [16] D. Blaauw, S. M. Martin, T. N. Mudge, and K. Flautner. Leakage current reduction in VLSI systems. Journal of Circuits, Systems, and Computers, 11(6):621–636, 2002.
- [17] Young-Shig. Choi and Dae-Hyun Han, "Gain-Boosting Charge Pump for Current Matching in Phase-Locked Loop," *IEEE Trans.* on Circuits and Systems II: Exp. Briefs, vol.53, no.10, pp. 1022-1025, October 2006.

- [18] M.-S. Hwang, J. Kim, and D.-K. Jeong, "Reduction of pump current mismatch in charge-pump PLL," *Electronics Letters*, vol. 45, no. 3, pp. 135-136, January 2009.
- [19] Kuo-Hsing Cheng, Yu-Chang Tsai, Yu-Lung Lo, and Jing-Shiuan Huang, "A 0.5-V 0.4–2.24-GHz inductorless phase-locked loop in a system-on-chip," *IEEE Trans.on Circuits and Systems I*, vol. 58, no. 5,pp. 849–859, May 2011.
- [20] Ronald G. Dreslinski, et al, "Near-Threshold computing : reclaiming Moore's law through energy efficient integrated circuit
- [21] Balachandran G K, Allen P E. Switched-current circuits in digital CMOS technology with low charge-injection errors. IEEE J. Solid-State Circuits, 2002, 37(10): 1271
- [22] Floyd M. Gardner, "Phaselock techniques", Wiley-Interscience, 3rd edition, 2005.
- [23] Ting-Sheng Chao, Yu-Lung Lo, Wei-Bin Yang, and Kuo-Hsing Cheng, "Designing ultra-low voltage PLL using a bulk-driven technique," in *Proc. European Solid-State Circuits Conf.*, pp. 388– 391, September 2009
- [24] E. Rubiola, M. Olivier and J. Groslambert, "Phase noise in the regenerative frequency dividers," IEEE Trans. Instrum. Meas. IM-41, pp. 353-360, June 1992.
- [25] S. Verma, H.R. Rategh and T.H. Lee, "A Unified model for injection-locked frequency dividers," IEEE J. Solid-state circuits 38, pp. 813-821, June 2003.
- [26] J. Navarro Soares, Jr. and W. A. M. Van Noije, "A 1.6-GHz dual modulus prescaler using the extended true-single-phase-clock CMOS circuit technique (TSPC)," *IEEE J. Solid-State Circuits*, vol. 34, no. 1, pp. 97–102, Jan. 1999.

- [27] J. Yuan and C. Svensson, "High-speed CMOS circuit technique," *IEEE J.Solid-State Circuits*, vol. 24, no. 1, pp. 62–70, Feb. 1989.
- [28] Hsieh-Hung Hsieh, Chung-Ting Lu, and Liang-Hung Lu, "A 0.5-V 1.9-GHz low-power phase-locked loop in 0.18-um CMOS," in *Proc. Symp. VLSI Circuits (VLSIC)*, pp. 164–165, June 2007.
- [29] Wu-Hsin Chen, Wing-Fai Loke, and Byunghoo Jung, "A 0.5-V, 440-uW frequency synthesizer for implantable medical devices," *IEEE Journal of Solid-State Circuits*, vol. 47, no. 8, pp.1896-1907, August 2012.
- [30] Joung-Wook Moon, Sung-Guen Kim, Dae-Hyun Kwon, and Woo-Young Choi, "A 0.4-V, 500-MHz, Ultra-low-power phase-locked loop for near-threshold voltage operation," *IEEE Custom Integrated Circuits Conference*, pp. 1-4, September 2014.
- [31] N. H. E. Weste and K. Eshraghian, "Principles of CMOS VLSI Design," 2nd ed. Reading, MA: Addison Wesley, 1993.
- [32] B. Razavi, Design of Analog CMOS Integrated Circuits, 2001: McGraw-Hill.
- [33] M. Mansuri, D. Liu, and C.-K. K. Yang, "Fast frequency acquisition phase-frequency detectors for Gsamples/s phase-locked loops", *IEEE J. Solid-State Circuits*, vol. 37, no. 10, pp. 1331-1334, 2002.
- [34] C.-N. Chuang and S.-I. Liu, "A 0.5-5-GHz wide-range multiphase DLL with a calibrated charge pump," *IEEE Trans.on Circuits and Systems II, Exp. Briefs*, vol. 54, no. 11, pp. 939-943, November 2007.
- [35] M. Jalalifar and G.-S. Byun, "Near-threshold charge pump circuit using dual feedback loop," *Electronics Letters*, vol.49, no.29, pp.1436-1438, November 2013.

- [36] M. Jamal Deen, Mehdi H. Kazemeini, and Sasan Naseh, "Performance characteristics of an ultra-low power vco", in *Proc. IEEE Int. Symp. on Circuits and Systems (ISCAS)*, May 2003, pp. 697–700.
- [37] Sung-Mo Kang and Yusuf Leblebici, "CMOS Digital Integrated Circuits" WCB/McGraw-Hill, New York 1999.
- [38] W.B. Wilson, U.K. Moon, K.R. Lakshmikumar and L. Dai, "A CMOS self-calibrating frequency synthesizer," IEEE J. Solid-State Circuits, vol. 35, pp. 1437–1444, Oct. 2000.
- [39] Adem Aktas and Mohammed Ismail "CMOS PLL calibration techniques", *IEEE Circuits Devices Mag.*, vol. 20, no. 5, pp.6 -11 2004.
- [40] Jan M. Rabaey, Anantha Chandrakasan, and Borivoje Nikolic, "Digital integrated circuit - A design perspective", Prentice hall 2003.
- [41] Joung-Wook Moon, Kwang-Chun Choi, and Woo-Young Choi, "A 0.4-V, 90~350-MHz PLL with an active loop-filter charge pump," *IEEE Trans. on Circuits and Systems II: Express Briefs*, vol.61, no.5, pp. 319-323, 2014.
- [42] Lin Li, Luns Tee, and Paul R. Gray, "A 1.4 GHz differential low-noise CMOS frequency synthesizer using a wideband PLL architecture," *IEEE Int. Solid-State Circuits Conf. Digest of Tech. Papers*, pp. 204-205, February 2000.
- [43] Masaomi Toyama, Shiro Dosho and Naoshi Yanagisawa, "A design of a compact 2GHz-PLL with a new adaptive active loop filter circuit," in *Proc. Symp. VLSI Circuits (VLSIC)*, pp. 185-188, June 2003.

- [44] Chi-Nan Chuang and Shen-Iuan Liu, "A 0.5-5-GHz wide-range multiphase DLL with a calibrated charge pump," *IEEE Trans.on Circuits and Systems II, Exp. Briefs*, vol. 54, no. 11, pp. 939-943, November 2007
- [45] A.M. Fahim, "A compact, low-power low-jitter digital PLL," in Proc. European Solid-State Circuits Conf., pp. 101-104, September 2003.

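### **Abstract (Translated from Korean)**

# Ultra-Low Power Phase-Locked Loops for Near-Threshold Voltage Operation

In most integrated circuits, power consumption has become an important design consideration, and low-power phase-locked loops for clock generation in integrated circuits are a very active research topic. Lowering the supply voltage is used as the most effective way to reduce power consumption, but this approach does not always improve power efficiency or energy efficiency.

In this dissertation, ultra-low-power, highly efficient phase-locked loops operating at near-threshold voltage are implemented in standard CMOS technology. A new charge pump and a new phase-locked loop architecture are proposed to solve the voltage headroom problem that arises when the supply voltage is lowered.

The phase-locked loop fabricated in 65-nm standard CMOS technology is fully integrated and operates at 500 MHz with a 0.4-V supply. The newly proposed charge pump achieves perfect current matching and reduces the variation caused by changes in the charge pump output voltage. The voltage-controlled oscillator, which uses an automatically calibrated body-bias technique, shows very low gain over a wide frequency range. In addition, the phase-frequency detector is designed to be insensitive to PVT variation and to eliminate the dead zone. We also analyzed E-TSPC and TSPC digital frequency dividers, which provide high operating frequency and low power. The proposed phase-locked loop consumes 127.8 μW, achieving a high power efficiency of 0.256 mW/GHz.

This dissertation also extends the phase-locked loop architecture. The proposed near-threshold-voltage phase-locked loop includes a charge pump with an active loop filter. Fabricated in 65-nm CMOS technology, this phase-locked loop significantly reduces the reference spur and relieves the slew-rate burden of the amplifier. The proposed phase-locked loop operates from 90 to 350 MHz with a 0.4-V supply and has a power efficiency of 0.31 mW/GHz.

Keywords: supply-regulated active loop filter, near-threshold voltage region, phase-locked loop, automatic frequency calibrator, charge pump, current mismatch, current variation, ultra-low power, ultra-low voltage, voltage-controlled oscillator.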
