

# FEASIBILITY STUDY OF FREQUENCY DOUBLING USING AN XOR-GATE METHOD

Song Ma MSc. Thesis January 2013

FACULTY OF ELECTRICAL ENGINEERING, MATHEMATICS & COMPUTER SCIENCE CHAIR OF INTEGRATED CIRCUIT DESIGN

EXAMINATION COMMITTEE Prof.dr.ir. B. Nauta Dr.ir. A.J. Annema Dr.ir. M.J. Bentum E. Olieman, MSc.

DOCUMENT NUMBER ELECTRICAL ENGINEERING - 067.3496

**UNIVERSITY OF TWENTE.** 

15/01/2013

# Feasibility Study of Frequency Doubling using an XOR-gate Method

Song Ma

S1069349

Supervisor: Dr. Anne-Johan Annema

Msc. Erik Olieman

### Contents

| Section                                                             |
|---------------------------------------------------------------------|
| Abstract                                                            |
| Chapter 1 Introduction                                              |
| 1.1 Motivation                                                      |
| 1.2 Project Goal                                                    |
| 1.3 Specifications                                                  |
| 1.4 Solution Directions                                             |
| 1.5 Report Outline                                                  |
| 1.6 Summary                                                         |
| Chapter 2 Architecture Level Considerations                         |
| 2.1 Definitions                                                     |
| 2.1.1 Definition of Edge                                            |
| 2.1.2 Duty Cycle Error                                              |
| 2.1.3 Definition of Jitter/ Phase Noise                             |
| 2.1.4 Relation Between Random Voltage Noise, Jitter and Phase Noise |
| 2.2 Architecture Level Analysis                                     |
| 2.2.1 Overview                                                      |
| 2.2.2 Crystal Oscillator                                            |
| 2.3 Summary                                                         |
| Chapter 3 Frequency Doubling                                        |
| 3.1 Analog Method                                                   |
| 3.1.1 Idea                                                          |
| 3.1.2 Circuit Implementation                                        |
| 3.1.3 Performance Analysis                                          |
| 3.1.4 Limitations                                                   |
| 3.2 Digital Method                                                  |
| 3.2.1 Idea                                                          |
| 3.2.2 Circuit Implementation                                        |
| 3.2.3 Performance Analysis                                          |
| 3.2.4 Limitations                                                   |
| 3.3 Summary Comparison Between Analog Method and Digital Method     |
| 3.3.1 Same Properties                                               |
| 3.3.2 Difference                                                    |
| Chapter 4 90° Delay Cell                                            |
| 4.1 Overview of the DLL                                             |
| 4.2 Implementation of the 90° Delay Cell                            |
| 4.2.1 Delay Element                                                 |
| 4.2.2 Implementation of the 90° Delay Cell                          |
| 4.2.3 Limitations                                                   |
| 4.3 Summary                                                         |
| Chapter 5 Duty Cycle Correction Circuit (DCC)                       |
| 5.1 Idea                                                            |
| 5.2 Circuit Implementation                                          |
| 5.2.1 Error Detection                                               |
| 5.2.2 Error Correction                                              |
| 5.2.3 Entire Duty Correction Circuit                                |
| 5.3 Performance Analysis                                            |
| 5.3.1 Duty Cycle Rate                                               |
| 5.3.2 Phase Noise                                                   |
| 5.3.3 Power Dissipation                                             |
| 5.4 Limitations                                                     |
| 5.4.1 The Size of Switch                                            |
| 5.4.2 Tail Current Source and Capacitor C                           |
| 5.5 Summary                                                         |
| Chapter 6 Conclusion and Recommendations for Further Research       |
| 6.1 Summary                                                         |
| 6.2 Conclusion                                                      |
| 6.3 Recommendations for Further Research                            |
| Chapter 7 Comparison Between This Work and R. Oortgiesen's          |
| Reference                                                           |
| Acknowledge                                                         |

# Abstract

The performance of integrated frequency synthesizers relies on a clean, fixed reference frequency, which is usually derived from a crystal. Unfortunately, low-cost commercial crystal oscillators are limited to the range of 20-50MHz. In general, a higher reference frequency results in better noise performance for frequency synthesizers. It is therefore desirable to multiply the reference frequency while preserving the clean properties of the crystal.

This work examines the feasibility of a low power and low noise frequency doubler in CMOS IC-technology. The main target specifications are a -151 dBc/Hz phase noise floor, a 10kHz flicker noise corner frequency and a precise 50% duty cycle rate, within a power budget of approximately 4mW. Within this scope a known analog method has been analyzed and shown to be sub-optimal.

A digital method using an XOR-gate has been proposed. The basic idea is that the frequency of the input clock is doubled at the output of an XOR-gate if its two input clock signals have a 90° phase shift. The advantages are that the circuit is highly digital, which implies low power dissipation, and that its phase noise floor is 10 dB lower than that of the analog method while dissipating less than 2uW at 20MHz, which is a factor of one hundred less power than the analog method. However, the major drawback of this approach is that static timing errors, due to the different transition times of NMOS and PMOS transistors, spoil the duty cycle rate: it is no longer 50%. A duty cycle correction circuit is therefore added to detect and correct this duty cycle error. The system has been analyzed on system level and implemented and simulated on circuit level.

# **Chapter 1 Introduction**

## **1.1 Motivation**

Wireless communication plays an increasingly important role in our everyday lives. For example, when people use cell phones to talk with others, their mobile phones convert the human voice (low frequency) into a radio frequency (RF) signal (high frequency). This RF signal is received by another mobile phone and converted back into a baseband signal (low frequency), i.e. the human voice. The receiving action is depicted in figure 1.1, which shows a typical RF receiver architecture. From figure 1.1, we can see that the conversion from RF to baseband is carried out by a mixer which is controlled by a frequency synthesizer.



Figure 1.1 A typical RF receiver architecture

A frequency synthesizer generates a variety of stable, tunable frequencies which are used to tune to the radio frequency of interest. It relies on a clean fixed reference frequency, which is usually derived from a crystal and largely determines the performance of the frequency synthesizer, as indicated in figure 1.2.



Figure 1.2 A typical architecture of frequency synthesizer

Unfortunately, cheap commercial crystals are limited to the range of 20-50MHz. For a fractional-N frequency synthesizer, a higher reference frequency makes it possible to reduce the noise contribution from the sigma-delta modulator in the synthesizer [1]. Therefore there is a desire to double (or, even better, multiply) the reference frequency while preserving the clean properties of crystals.

## **1.2 Project Goal**

The goal of this project is to examine the feasibility of a low power and low noise frequency doubling circuit to be placed between the fixed reference frequency source (crystal) and the frequency synthesizer, as shown in Fig. 1.3.



Figure 1.3 System perspective of frequency doubler sub-section

## **1.3 Specifications**

Since the frequency doubler (Fig. 1.3) will act as the fixed reference frequency generator for the frequency synthesizer, it should retain the properties of a crystal in terms of noise, power and duty cycle rate. This puts relatively high demands on the doubler sub-section, since a crystal oscillator has inherently very good noise properties.

In this project, the same specifications are used as those derived by R. Oortgiesen in his work [2]. They are illustrated in Fig. 1.4 and given in table 1.1.



Figure 1.4 Specifications overview

The doubled output frequency, in the range of 40 - 100 MHz, should have a phase noise floor of -151dBc/Hz. The 1/f corner frequency, which is usually dominated by the flicker noise of MOSFETs in, for example, buffers, should not exceed 10kHz. Furthermore, the duty cycle rate should be close to 50%. These specifications have to be met within a power budget of roughly 4mW.

| Parameter                     | Typical           | Maximum | Unit   |
|-------------------------------|-------------------|---------|--------|
| Output frequency              | 2*f<sub>in</sub>  |         | Hz     |
| Output phase noise floor      | -151              | -149    | dBc/Hz |
| Output phase noise 1/f corner | 10                | 15      | kHz    |
| Duty cycle rate               | 50                |         | %      |
| Power dissipation             | 4                 |         | mW     |

Table 1.1 Target performance specifications

## **1.4 Solution Directions**

This document first examines the frequency doubler employing an analog method, proposed by R. Oortgiesen in his work [2]. Within the given specifications, the analog method was analyzed, and shown to be sub-optimal.

Next to the analog method, an alternative method using an XOR gate and a delay element is explored. The latter method has been used in front of the  $\Sigma\Delta$  fractional synthesizer in [1] [3] to reduce the in-band phase noise. The evident advantage of this method is that the circuit is highly digital, which implies low power dissipation. However, the major drawback is that static timing errors, due to the different transition times of NMOS and PMOS transistors, spoil the duty cycle rate: it is no longer 50%. A duty cycle correction circuit (DCC) is therefore necessary to detect and correct the error.

This thesis describes research into the feasibility of a doubling circuit based on a delay element and an XOR-gate, combined with the proposed correction circuit that brings the duty cycle rate close to 50%.
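The XOR idea can be illustrated with a purely behavioural sketch. The helper names (`square`, `xor_doubler`, `rising_edges`, `duty_cycle`), the sample count and the 10-sample delay error are assumptions made for this illustration only; they model ideal square waves and are not the circuit analyzed in this thesis.

```python
def square(k, n):
    """Ideal 50% duty-cycle square wave, sampled n times per input period."""
    return 1 if (k % n) < n // 2 else 0

def xor_doubler(n, delay_samples):
    """XOR of the clock with a copy delayed by delay_samples, over one input period."""
    return [square(k, n) ^ square(k - delay_samples, n) for k in range(n)]

def rising_edges(bits):
    """Count 0->1 transitions, treating the sequence as periodic."""
    return sum(1 for a, b in zip(bits, bits[1:] + bits[:1]) if a == 0 and b == 1)

def duty_cycle(bits):
    """Fraction of samples at logic high."""
    return sum(bits) / len(bits)

N = 1000                          # samples per input period (hypothetical)
ideal = xor_doubler(N, N // 4)    # exact 90 degree delay
skewed = xor_doubler(N, 240)      # hypothetical delay error of 1% of the period

print(rising_edges(ideal), duty_cycle(ideal))
print(rising_edges(skewed), duty_cycle(skewed))
```

With an exact quarter-period delay the output has two rising edges per input period and a 50% duty cycle. With the shortened delay the frequency is still doubled, but the duty cycle drops below 50%, which is the kind of static error the correction circuit has to remove.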

A UMC 130nm CMOS technology is used in this project.

## **1.5 Report Outline**

First, architecture level considerations are discussed in chapter 2: several critical definitions are explained, followed by the architecture level analysis.

In chapter 3, frequency doubling is analyzed in two ways. The first section discusses the analog method. Its limitations motivate an alternative method using a digital circuit, which is introduced in the second section. The third section gives a brief comparison between the analog method and the digital method.

Chapter 4 gives a simple realization of a 90° delay cell that keeps both the rising edge and the falling edge accurate at the same time.

In chapter 5, a duty cycle correction circuit is introduced. It is required because the duty cycle is degraded by frequency doubling, whether done in an analog or in a digital way.

Chapter 6 follows with the conclusions and recommendations for further research.

Chapter 7 shows a brief comparison between this work and R. Oortgiesen's.

## 1.6 Summary

Chapter one starts with the motivation of this project: a higher reference frequency is needed for the frequency synthesizer to reduce the noise contribution. Therefore, there is a desire to double the reference frequency while preserving the clean properties of crystals. After setting the project goal, the required specifications are presented: a phase noise floor below -151dBc/Hz, a power dissipation below 4mW and a precise 50% duty cycle rate. Then the solution directions are analyzed: frequency doubling can be done in an analog or in a digital way. This work focuses on the feasibility of a doubling circuit based on a delay element and an XOR-gate, with a correction circuit that brings the duty cycle rate close to 50%. The report outline is given at the end.

# **Chapter 2 Architecture Level Considerations**

## 2.1 Definitions

Before starting the actual architecture analysis, several important definitions are clarified; these include the edge definition, the duty cycle error, and the definition of jitter/ phase noise.

### 2.1.1 Definition of Edge

In this work, the middle point of the edge defines the actual position of the edge. When a clock goes from a low level to a high level or from a high level to a low level, the transition takes a certain time, called the rise time and the fall time, respectively, as shown in figure 2.1(a). It is therefore necessary to pick one point within the transition to define the actual position of the edge. The mid point of the edge is a natural choice, analogous to the zero-crossing point of a sine wave, as indicated in figure 2.1 (b).



Figure 2.1 Definition of edge

### 2.1.2 Duty Cycle Error

#### (1) Duty cycle rate (DCR)

The duty cycle rate (DCR) is defined in equation (2.1), in which T is the period of the clock and  $t_{on}$  is the on-time from the rising edge to the falling edge within one period, as illustrated in figure 2.2. For example, when  $t_{on}$  equals T/2, the DCR is 50%. In this project, a precise 50.000% duty cycle rate is one of the required specifications.

$$DCR = \frac{t_{on}}{T} \cdot 100\% \tag{2.1}$$

#### (2) Duty cycle error

As seen in figure 2.2, an error arises when adjacent half-periods of the clock are unequal, which is most commonly referred to as duty cycle error. A duty cycle error of 1% means that the duty cycle rate is either 49% or 51%, giving unequal adjacent half-periods.



Figure 2.2 Definition of duty cycle rate
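As a small numerical illustration of equation (2.1), the sketch below computes the DCR and its deviation from 50%. The function names and the example clock (a 25 ns period, i.e. 40MHz, with times expressed in picoseconds) are assumptions chosen for the example.

```python
def duty_cycle_rate(t_on, period):
    """DCR in percent, following equation (2.1)."""
    return 100.0 * t_on / period

def duty_cycle_error(t_on, period):
    """Signed deviation of the DCR from 50%, in percentage points."""
    return duty_cycle_rate(t_on, period) - 50.0

PERIOD_PS = 25000                     # hypothetical 40MHz clock, period in ps

print(duty_cycle_rate(12500, PERIOD_PS))    # on-time equal to half the period
print(duty_cycle_error(12250, PERIOD_PS))   # on-time 250 ps too short
```

In this example an on-time that is 250 ps short of half the period corresponds to a duty cycle error of one percentage point.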

The duty cycle error consists of two types of errors. One is the static error; the other is the random error, i.e. jitter/phase noise, which is discussed in the next sub-section.

A static error is caused by the different transition times of NMOS and PMOS transistors. Figure 2.3 shows an example. Assume that a sine wave with a perfect 50% duty cycle rate, as shown in figure 2.3 (a), is converted into a square wave, as indicated in figure 2.3 (c), by two inverters, as illustrated in figure 2.3 (b). The duty cycle rate of the square wave is then spoiled: it is less than 50% because of the different transition times of the NMOS and PMOS transistors. When the sizes of the inverters are fixed, the static error is fixed as well.

To remove this static error, an additional duty cycle correction circuit is needed to detect and correct it. The duty cycle correction circuit (DCC) is discussed in detail in chapter 5.

In this report, the term duty cycle error refers to the static error. Random errors are identified separately as jitter/phase noise.



Figure 2.3 Static error

### 2.1.3 Definition of Jitter/ Phase Noise

As seen in figure 2.4, a rising edge may arrive a little early or a little late by  $\Delta t$ , a random time displacement caused by random noise (the random voltage noise V<sub>n</sub> in the case of figure 2.4). This random time displacement  $\Delta t$  is defined as jitter in the time domain and is identified as phase noise in the frequency domain, as illustrated in figure 2.4.



Figure 2.4 Definition of jitter/phase noise

### 2.1.4 Relation Between Random Voltage Noise, Jitter and Phase Noise

#### (1) $V_n$ and $\Delta t$ ( $\tau_{rms}$ )

As mentioned in the previous sub-section, jitter  $\Delta t$  is caused by the random voltage noise  $V_n$ . The noise observed at the middle point (crossing moment) of the edge causes a corresponding time displacement of the middle point (jitter), whose magnitude depends on the steepness of the edge, as indicated in figure 2.5. Noise in the voltage domain is therefore related to jitter in the time domain by the time derivative of the edge, or slew rate (SR), as shown in equation (2.2). Only the  $V_n$  at the middle point (crossing moment) generates  $\Delta t$ ;  $V_n$  at other points has no effect on the time displacement.



Figure 2.5 Relation of noise in the voltage domain and jitter in the time domain.

$$\Delta t(t_i) = \frac{V_n(t_i)}{SR(t_i)} \tag{2.2}$$

where  $V_n$  is the root mean square voltage noise at the middle point of the edge and SR is the slew rate at this point.

Based on equation (2.2), less voltage noise and a larger slew rate are needed to obtain less jitter in the time domain.
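A minimal numerical sketch of equation (2.2) follows. The function name and the example values (10 mV of noise on an edge with a slew rate of 1E10 V/s) are hypothetical and only serve to show the scaling.

```python
def jitter_from_noise(v_n, slew_rate):
    """Time displacement in seconds, equation (2.2): v_n in V, slew_rate in V/s."""
    return v_n / slew_rate

dt = jitter_from_noise(10e-3, 1e10)           # hypothetical 10 mV noise, 1E10 V/s edge
dt_steeper = jitter_from_noise(10e-3, 2e10)   # same noise, edge twice as steep

print(dt, dt_steeper)
```

Doubling the slew rate halves the jitter for the same voltage noise, which is why steep edges are preferred.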

#### (2) Δt and phase noise L(f)

According to [4], the relation between phase noise and jitter can be expressed as equation (2.3):

$$\begin{aligned} L(f) &= 2 \cdot \left(\frac{\pi f_{out}}{SR}\right)^2 \cdot S_{V_n}(f) \\ &= 2\pi^2 \cdot f_{out} \cdot \tau_{rms}^2 = 2\pi^2 \cdot f_{out} \cdot \left(\frac{V_n}{SR}\right)^2 \end{aligned} \tag{2.3}$$

where L(f) is the output phase noise floor,  $f_{out}$  is the frequency of the output signal, SR is the slew rate of the edge,  $S_{Vn}(f)$  is the power spectral density of the voltage noise, and  $\tau_{rms} = V_n/SR$  is the rms jitter from equation (2.2). L is expressed in dBc/Hz, i.e. as  $10\log_{10}$  of the expression above.

According to the specifications in chapter 1, the required phase noise floor is -151dBc/Hz, which corresponds to about 1ps of rms jitter in the time domain when  $f_{out}$  is 40MHz.
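The correspondence between the -151dBc/Hz floor and roughly 1ps of jitter can be checked numerically. The sketch below uses only the last form of equation (2.3); the function names are assumptions introduced for this example.

```python
import math

def phase_noise_dbc(f_out, tau_rms):
    """Phase noise floor in dBc/Hz from equation (2.3): 2*pi^2*f_out*tau_rms^2."""
    return 10 * math.log10(2 * math.pi**2 * f_out * tau_rms**2)

def jitter_for_floor(f_out, floor_dbc):
    """Invert equation (2.3): rms jitter in seconds that yields the given floor."""
    return math.sqrt(10 ** (floor_dbc / 10) / (2 * math.pi**2 * f_out))

tau = jitter_for_floor(40e6, -151.0)      # target floor at f_out = 40MHz
floor = phase_noise_dbc(40e6, 1e-12)      # floor produced by exactly 1 ps rms

print(f"jitter for -151 dBc/Hz at 40 MHz: {tau * 1e12:.3f} ps")
print(f"floor for 1 ps at 40 MHz: {floor:.2f} dBc/Hz")
```

Because the floor scales with the square of the jitter, halving the jitter lowers the floor by about 6 dB at a fixed output frequency.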

## 2.2 Architecture Level Analysis

### 2.2.1 Overview

The architecture of a frequency doubler is shown in figure 2.6, in which a crystal oscillator generates a clean reference frequency  $f_{in}$  as the input signal for the frequency doubler. After frequency doubling, the frequency of the output signal  $f_{out}$  is twice  $f_{in}$ . However, as mentioned in chapter 1.4, the duty cycle rate of  $f_{out}$  is spoiled by the frequency doubling: it is no longer 50%. A duty cycle correction circuit is needed after the frequency doubling to detect and correct this duty cycle error, as indicated in figure 2.7. The architecture of the frequency doubler is therefore composed of three parts: the crystal oscillator, the frequency doubling circuit and the duty cycle correction circuit.

The frequency doubling can be realized in two ways, using an analog method or a digital method, both of which are discussed in chapter 3. The duty cycle correction circuit is analyzed in chapter 5. The crystal oscillator is introduced in the next sub-section.



Figure 2.6 The architecture of frequency doubler



Figure 2.7 The architecture of frequency doubler with DCC

### 2.2.2 Crystal Oscillator



Figure 2.8 A typical Pierce configuration oscillator circuit.

A typical Pierce configuration crystal oscillator circuit is shown in figure 2.8. A detailed introduction to crystal oscillators is omitted here; readers are referred to [4]. In this project, the output of this Pierce configuration oscillator is used as the fixed clean reference clock. It is modeled as a sine wave with a perfect 50% duty cycle rate and a full output swing from zero to the supply voltage of 1.2V (determined by the UMC 130nm CMOS technology), as indicated in figure 2.9 (a).



Figure 2.9 Output waveform of crystal oscillator

However, it is necessary to convert this sine wave into a square wave. The clean fixed reference for a PLL is usually a square wave instead of a sine wave, because a square wave has a steeper slew rate than a sine wave, which means that the phase noise will be lower according to equations (2.2) and (2.3). An extra circuit is therefore needed to do this conversion, for example two inverters or a comparator.

With either inverters or a comparator, however, the duty cycle rate is impaired, as illustrated in figure 2.9 (b), due to the different transition times of NMOS and PMOS transistors. To restore the duty cycle rate, a duty cycle correction circuit is needed, as mentioned in chapter 2.1.2.

Figure 2.10 shows the entire architecture of frequency doubler.



Figure 2.10 Practical architecture of frequency doubler

## 2.3 Summary

Chapter 2 introduces the definition of the edge (mid point) and the duty cycle error, which consists of two types of errors: a static error caused by the different transition times of NMOS and PMOS transistors, and a random error created by the random voltage noise at the middle point of the edge. Then the definition of jitter/phase noise is presented, followed by the relation between random voltage noise, jitter and phase noise. To remove the static error, a duty cycle correction circuit is added. To obtain less phase noise, a steeper slew rate and lower voltage noise are needed. The second section of this chapter discusses the architecture level considerations of the doubler circuit and derives the architecture schematic of the frequency doubler. Finally, a short introduction to the crystal oscillator is given and the complete architecture schematic is presented.

# **Chapter 3 Frequency Doubling**

As mentioned in chapter 2, there are two methods to realize the frequency doubling: an analog method and a digital method. In this chapter, the analog method is analyzed in the first section. Its problems and limitations motivate the digital method. Finally, a brief comparison between the two methods is given.

## 3.1 Analog Method

### 3.1.1 Idea

The core idea of the work in [2] is to use both edges of the input clock (both the rising and the falling) and to combine them in such a way that both the falling and the rising edges become the rising edges, which is illustrated in figure 3.1.



Figure 3.1 Core idea: Use both edges.

For the frequency of the output clock to be exactly twice the frequency of the input clock, the duty cycle rate of the input clock must be precisely 50%. Otherwise, the output edges are not equally spaced and the frequency is not exactly doubled.

### 3.1.2 Circuit Implementation

One way of realizing a dual-edge doubler, as described in the previous section, is to make use of the properties of a differential amplifier. A differential pair amplifies the difference between its two input signals and offers common-mode rejection, high rejection of supply noise, and high output swing (compared to a single-ended stage). This differential signal processing can be exploited by using a differential pair as the doubler circuit.

#### (1) Theoretical analysis

The schematic is shown in figure 3.2.

 $V_s$  is the input clock, i.e. the clean fixed reference clock, which is modeled as the output clock of the crystal oscillator.  $V_c$  is the common-mode voltage, which has a fixed value of 0.6V. M9-M12 are four switches controlled by the clocks CLK and NCLK<sup>1</sup>. M1, M2, R1, R2 and I<sub>s1</sub> form the first differential stage, which amplifies the difference between the two input signals  $V_{in1}$  and  $V_{in2}$ . M5, M6, M7, M8 and I<sub>s2</sub> form the second differential stage, which acts as a comparator converting the differential input signal into a single-ended output clock.



Figure 3.2 Schematic of analog method frequency doubler

To understand how this doubler circuit works, consider figure 3.3, which shows the theoretical waveforms at the corresponding nodes of the doubler circuit.

 $V_s$  is the input clock. CLK and NCLK are the two clocks controlling the switches M9-M12; they have a 180° phase shift with respect to each other. Furthermore, CLK leads  $V_s$  by 90° and NCLK lags  $V_s$  by 90°, as illustrated in figure 3.3 (a), (b) and (c).



Figure 3.3 Theoretical waveform of (a) input clock V<sub>s</sub> (b) input clock CLK (c) input clock NCLK

<sup>1</sup> R. Oortgiesen simply assumed that the crystal oscillator can provide the three clocks  $V_s$ , CLK, and NCLK. However, as discussed in chapter 2.2.2, a crystal oscillator can only create a single-ended output clock, which means that a delay-locked loop (DLL) is needed to derive the other two clocks, CLK and NCLK, from the crystal oscillator. The DLL will be discussed in chapter 4.

When CLK is "on" and NCLK is "off", switches M9 and M11 are "on" and switches M10 and M12 are "off". During this half period,  $V_{in1}$  equals  $V_s$  and  $V_{in2}$  equals  $V_c$ . Likewise,  $V_{in1}$  equals  $V_c$  and  $V_{in2}$  equals  $V_s$  during the next half period, as indicated in figure 3.3 (d) and (e).

Figure 3.3 Theoretical waveform of (d) positive input of the first differential stage V<sub>in1</sub> (e) negative input of the first differential stage V<sub>in2</sub>

The differential input of the first differential stage,  $V_{in1}$ - $V_{in2}$ , thus already realizes the frequency doubling, as shown in figure 3.3 (f). After two stages of amplification,  $V_{out}$  is obtained.



Figure 3.3 Theoretical waveform of (f) differential input of the first differential stage  $V_{in1}$ - $V_{in2}$  (g) frequency doubled output clock  $V_{out}$ 
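The switching scheme described above can be mimicked with an idealized behavioural model. The sketch assumes a sinusoidal  $V_s$  centred on  $V_c$ , ideal switches, and a CLK that is high exactly while it leads  $V_s$  by 90°; the function names and the 0.3V amplitude are assumptions made for illustration, not the transistor-level circuit.

```python
import math

def differential_input(theta, amplitude=0.3):
    """Ideal V_in1 - V_in2 at input phase theta (radians)."""
    vs_minus_vc = amplitude * math.sin(theta)   # V_s - V_c
    clk_on = math.cos(theta) > 0                # CLK leads V_s by 90 degrees
    # CLK on:  V_in1 = V_s, V_in2 = V_c  ->  difference = V_s - V_c
    # CLK off: V_in1 = V_c, V_in2 = V_s  ->  difference = V_c - V_s
    return vs_minus_vc if clk_on else -vs_minus_vc

def rising_zero_crossings(samples):
    """Count negative-to-non-negative transitions in a sampled waveform."""
    return sum(1 for a, b in zip(samples, samples[1:]) if a < 0 <= b)

N = 1000
# One input period, starting a quarter period before the rising edge of V_s
thetas = [-math.pi / 2 + 2 * math.pi * (k + 0.5) / N for k in range(N)]
input_wave = [math.sin(t) for t in thetas]
diff_wave = [differential_input(t) for t in thetas]

print(rising_zero_crossings(input_wave), rising_zero_crossings(diff_wave))
```

In this model the differential signal repeats every half input period, so both the rising and the falling zero crossings of  $V_s$  appear as rising crossings of  $V_{in1}$ - $V_{in2}$ .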

To check the theoretical analysis, a simulation of the circuit is carried out next.

#### (2) Simulation investigation

The doubler circuit of figure 3.2 is configured as shown in figure 3.4.  $V_s$  is the input clock with a low level of 0.3V and a high level of 0.9V.  $V_c$  is the common-mode voltage of 0.6V. All transistors have the minimum length of 0.13um. The switches all have the same width of 10um. Transistors M3 and M4 both act as current sources with a DC bias current of 300uA. R1 and R2 are 4k $\Omega$ . The supply voltage V0 is 1.2V. M1, M2, M5, and M6 are identically sized with a width of 12.42um. M7 and M8 are also matched, with a width of 4.59um. The simulation results are shown in figure 3.5 (a) to (g).



Figure 3.4 A practical schematic of the analog frequency doubler circuit

The simulation results are almost identical to the theoretical analysis. Comparing figure 3.3 (a) to (f) with figure 3.5 (a) to (f), the simulated waveforms are in accordance with the theoretical waveforms. However, there is a slight difference between the output clock in figure 3.3 (g) and in figure 3.5 (g): in figure 3.5 (g) the first low level is a little higher than the second low level. This is because the second differential pair is loaded with an active current mirror and not with two identical resistors. Due to the asymmetry of an active current mirror, this differential pair is not as symmetrical as the first differential stage. Therefore, in figure 3.5 (g), the first low level is not as close to zero as the second low level. This small difference can be ignored, because it will be compensated by the following stage, the duty cycle correction circuit (DCC). In short, the simulation results are consistent with the theoretical analysis and support the feasibility of using the analog method for frequency doubling.



Figure 3.5 (a) Simulation result of the input source V<sub>s</sub>

Figure 3.5 (c) Simulation result of the input clock NCLK

Figure 3.5 (g) Simulation result of the output clock V<sub>out</sub>

### **3.1.3 Performance Analysis**

Following the specifications in chapter 1.3, the phase noise and the power dissipation are used to evaluate the analog frequency doubler. The duty cycle rate is not considered here, because it is spoiled by the frequency doubling due to the different transition times of NMOS and PMOS transistors and is corrected afterwards by the DCC.

#### (1) Phase noise

Recall equation (2.3):

$$L(f) = 2\pi^2 \cdot f_{out} \cdot \left(\frac{V_n}{SR}\right)^2 \tag{2.3}$$

To find the phase noise floor,  $V_n$  and SR are needed, both of which can be obtained from simulation in Cadence using time-domain Pnoise analyses. Figure 3.6 (b) shows an example of a simulated waveform of  $V_n$  for the frequency doubler configuration of figure 3.4. The SR can be measured from figure 3.6 (a). Based on the values shown in figure 3.6, the phase noise floor at 40MHz can be calculated as

$$L = 10\log_{10}\left[2 \cdot (3.14)^{2} \cdot 40\times10^{6} \cdot \left(\frac{11.15\times10^{-3}}{1.67\times10^{10}}\right)^{2}\right] \approx -154.5 \text{ dBc/Hz}$$

which is about 3.5 dB below the -151dBc/Hz target.
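Broken into steps, and assuming  $V_n$  = 11.15mV and SR = 1.67E10 V/s are the values read from figure 3.6, the calculation with equation (2.3) runs as follows:

```latex
\begin{aligned}
\tau_{rms} &= \frac{V_n}{SR} = \frac{11.15\times10^{-3}\,\text{V}}{1.67\times10^{10}\,\text{V/s}} \approx 6.68\times10^{-13}\,\text{s} \\
2\pi^2 \cdot f_{out} \cdot \tau_{rms}^2 &\approx 2\cdot(3.14)^2\cdot 40\times10^{6}\cdot\left(6.68\times10^{-13}\right)^2 \approx 3.52\times10^{-16} \\
L &= 10\log_{10}\left(3.52\times10^{-16}\right) \approx -154.5\ \text{dBc/Hz}
\end{aligned}
```

The intermediate jitter of roughly 0.67ps is below the 1ps that corresponds to the -151dBc/Hz target at 40MHz.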



Figure 3.6 (a) Simulation result of the output clock V<sub>out</sub>

Figure 3.6 (b) Simulation result of the integrated voltage noise  $V_n$ 

Figure 3.6 (c) Simulation result of total current consumption  $I_{total}$ 

Two factors are critical for reducing the phase noise: decreasing  $V_n$  and increasing SR. To obtain a small  $V_n$  and a large SR, a larger differential input swing and a larger tail current in the differential pair can be used.

#### (a) Differential input swing

With the DC bias current of the two differential pairs kept constant, a larger differential input swing leads to a smaller  $V_n$  and a steeper SR, and therefore to lower phase noise at the output clock. The explanation is as follows. With a larger differential input swing, the slew rates of both the input clock and the output clock increase. A higher slew rate of the input clock means a shorter transition time between the high level and the low level, which leads to less  $V_n$  as well. Since the slew rate of the output clock increases and the integrated voltage noise of the output clock decreases, the phase noise goes down. To obtain less phase noise, the differential input swing should therefore be as large as possible.

Table 3.1 lists the simulation results that show the relation between the differential input swing and the SR, the V<sub>n</sub> and the phase noise, using the same configuration as in figure 3.4. When the swing increases from [-1, 1] to [-300, 300]mV, the SR of the rising edge increases from 4.22E8 to 1.82E10 and its V<sub>n</sub> decreases from 49.87mV to 5.81mV; the SR of the falling edge increases from 2.15E8 to 2.31E10 and its V<sub>n</sub> decreases from 52.23mV to 11.13mV. As a result, the phase noise reduces from -103.32dBc/Hz to -155.21dBc/Hz.

| Differential input swing<br>(mV) | Slew rate<br>V <sub>n</sub> | Rising edge | Falling edge | Phase noise    |
|----------------------------------|-----------------------------|-------------|--------------|----------------|
| [-1, 1]                          | SR                          | 4.22E8      | 2.15E8       | -103.32 dBc/Hz |
|                                  | Vn                          | 49.87mV     | 52.23mV      |                |
| [-10, 10]                        | SR                          | 3.29E9      | 2.98E9       | -127.69 dBc/Hz |
|                                  | Vn                          | 36.58mV     | 43.77mV      |                |
| [-100,100]                       | SR                          | 1.52E10     | 1.88E10      | -152.94 dBc/Hz |
|                                  | Vn                          | 11.80mV     | 15.09mV      |                |
| [-200,200]                       | SR                          | 1.79E10     | 2.24E10      | -154.34 dBc/Hz |
|                                  | Vn                          | 7.09mV      | 12.30mV      |                |
| [-300,300]                       | SR                          | 1.82E10     | 2.31E10      | -155.21 dBc/Hz |
|                                  | Vn                          | 5.81mV      | 11.13mV      |                |

Table 3.1 Relation between differential input swing and SR, V<sub>n</sub> and phase noise @ I\_tail=300uA

#### (b) Tail current source of differential pair.

A larger tail current gives rise to less  $V_n$  and a steeper SR. Table 3.2 shows the relation between the tail current and the SR, the  $V_n$ , the phase noise and the power dissipation.

| Tail current<br>(uA) | Slew rate<br>V <sub>n</sub> | Rising edge | Falling edge | Phase noise   | Power<br>dissipation |
|----------------------|-----------------------------|-------------|--------------|---------------|----------------------|
| 100                  | SR                          | 1.69E10     | 2.17E10      | -151.68dBc/Hz | 0.2mW                |
| 100                  | Vn                          | 9.47mV      | 20.13mV      |               |                      |
| 200                  | SR                          | 1.8E10      | 2.31E10      | -155.3dBc/Hz  | 0.4mW                |
| 200                  | Vn                          | 6.92mV      | 14.08mV      |               |                      |
| 300                  | SR                          | 1.82E10     | 2.31E10      | -157.5dBc/Hz  | 0.6mW                |
| 300                  | Vn                          | 5.77mV      | 10.94mV      |               |                      |
| 400                  | SR                          | 1.88E10     | 2.4E10       | -158.55dBc/Hz | 0.9mW                |
| 400                  | Vn                          | 5.29mV      | 10.09mV      |               |                      |

Table 3.2 Relation between tail current and SR, Vn, phase noise and power dissipation @ input swing [-300mV,300mV]

Although a larger tail current decreases the phase noise, the power consumption increases in proportion to the tail current, which indicates a trade-off between power dissipation and phase noise.

#### (2) Power dissipation

Power dissipation is determined by the DC bias current source: the more current, the more power is consumed, as table 3.2 clearly illustrates.

#### **3.1.4 Limitations**

#### (1) Precision of the duty cycle correction circuit

A duty cycle correction circuit is needed to make sure that the frequency of the output clock is exactly double the frequency of the input clock. Looking at figure 3.7, the accuracy of edges 3 and 4 determines whether the output frequency is exactly doubled. Edges 3 and 4 are in turn determined by edges 1 and 2. So, for precise frequency doubling, edges 1 and 2 must be precise, which means the duty cycle rate of the input clock V<sub>s</sub> must be exactly 50%. As mentioned in chapter 2.2, a duty cycle correction circuit is therefore necessary to ensure that the crystal oscillator delivers a precise 50% duty cycle rate, so that the output frequency is exactly doubled.



Figure 3.7 Ideal waveform of  $V_s$ , CLK, NCLK and  $V_{out}$ 

#### (2) Accuracy of the 90° delay cell

A delay-locked loop is needed as well. Since a crystal oscillator produces a single-ended output clock, a delay-locked loop is necessary to derive the two additional clocks CLK and NCLK from the crystal oscillator. Looking at figure 3.7, the accuracy of edges 9 and 10 is determined by edges 5, 6, 7 and 8. To keep a 50% duty cycle rate at the output clock, edges 9 and 10 must be precise, which requires edges 5, 6, 7 and 8 to be accurate. This demands a delay-locked loop that makes CLK precisely 90° earlier than V<sub>s</sub> and NCLK precisely 90° later than V<sub>s</sub>. Therefore, a delay-locked loop is needed to make sure the output clock keeps a precise 50% duty cycle rate.

#### (3) Power dissipation consideration

As discussed before, the power dissipation of the frequency doubler circuit is determined by the DC bias current source: the more power is spent, the lower the phase noise that can be achieved. For example, the phase noise floor is about -155.3dBc/Hz at 0.6mW when the total tail current is 300uA per stage and the supply voltage is 1.2V, as shown in figure 3.6 (c), which does not look bad at first sight. But compared with the digital method in the next section, even this small power consumption can be saved while obtaining the same phase noise.

According to the previous discussion, a larger differential input swing and a steeper SR are needed to lower the phase noise. This raises the question of whether the frequency can be doubled with a digital logic circuit, which is introduced in the next section.

# 3.2 Digital Method

Considering the limitations of the analog method in [2], a digital method is introduced to achieve frequency doubling in a more power efficient way.

### 3.2.1 Idea

The core idea of the digital method is to use an XOR-gate to realize the frequency doubling. Looking at figure 3.8,  $V_{clk+90^{\circ}}$  is delayed by 90° with respect to  $V_{clk}$  and the frequency of  $V_{out}$  is double that of  $V_{clk}$ . Taking  $V_{clk}$  and  $V_{clk+90^{\circ}}$  as the two inputs of a digital circuit and  $V_{out}$  as its output, the truth table of this circuit is obtained as shown in table 3.3:  $V_{out}$  is "1" when  $V_{clk}$  and  $V_{clk+90^{\circ}}$  are different and "0" when they are the same. This is exactly the characteristic of an XOR-gate. Therefore, an XOR-gate can be used for frequency doubling.



Figure 3.8 Core idea of digital method of frequency doubling

| V <sub>clk</sub> | V <sub>clk+90°</sub> | V <sub>out</sub> |
|------------------|----------------------|------------------|
| 0                | 0                    | 0                |
| 0                | 1                    | 1                |
| 1                | 0                    | 1                |
| 1                | 1                    | 0                |

Table 3.3 Truth table of XOR-gate.
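The idea can be checked with a small behavioural sketch. It is only an illustration with ideal square waves, not the transistor-level circuit: the time step (0.1ns per tick) and the helper names `square_wave`, `rising_edges` and `period_ticks` are assumptions made for this example.

```python
def square_wave(tick, period, delay=0):
    """Ideal 50% duty-cycle clock: 1 in the first half of each period, else 0."""
    return 1 if (tick - delay) % period < period // 2 else 0


def rising_edges(samples):
    """Indices at which the signal goes from 0 to 1."""
    return [i + 1 for i in range(len(samples) - 1)
            if samples[i] == 0 and samples[i + 1] == 1]


def period_ticks(samples):
    """Distance in ticks between the first two rising edges."""
    edges = rising_edges(samples)
    return edges[1] - edges[0]


PERIOD = 500                 # 20 MHz clock sampled with an assumed 0.1 ns tick
QUARTER = PERIOD // 4        # 90 degrees of the input period
ticks = range(4 * PERIOD)    # four input periods

v_clk = [square_wave(t, PERIOD) for t in ticks]
v_clk_90 = [square_wave(t, PERIOD, delay=QUARTER) for t in ticks]
v_out = [a ^ b for a, b in zip(v_clk, v_clk_90)]   # XOR of the two clocks

print(period_ticks(v_clk), period_ticks(v_out))
```

In this ideal model the output period is half the input period, and the output is high for exactly half of the samples, which corresponds to a doubled frequency with a 50% duty cycle rate.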

### **3.2.2 Circuit Implementation**



Figure 3.9 A typical topology of XOR-gate

Figure 3.9 shows a typical topology of an XOR-gate [5], in which B and A are the two input clocks and V<sub>out</sub> is the output clock. Based on this topology, an XOR-gate is dimensioned as shown in figure 3.11. There are twelve transistors in total, M1-M12. M1-M8 are identical to the transistors in figure 3.9. M9, M10 and M11, M12 form two inverters that generate the inverted versions of inputs A and B. All transistors have the minimum length of 0.13um and a width of 2um. The simulation results are shown in figure 3.12 (a) (b) (c). Inputs B and A are ideal square wave clocks at 20MHz and A is 90° later than B (clock B and clock A correspond to the input clocks V<sub>clk</sub> and V<sub>clk+90°</sub>). V<sub>out</sub> is the output of the XOR-gate. Figure 3.12 (c) shows that the frequency of V<sub>out</sub> is double that of B and A, which demonstrates the feasibility of frequency doubling using an XOR-gate.

Since the crystal oscillator only offers a single-ended output clock, a 90° delay cell is needed to derive the second clock A, delayed by 90° with respect to clock B. Figure 3.10 shows the entire architecture of the frequency doubler using the digital method.



Figure 3.10 The entire architecture of the frequency doubler using the digital method



Figure 3.11 A practical schematic of an XOR-gate






Figure 3.12 (a) Simulation result of the input clock B

Figure 3.12 (b) Simulation result of the input clock A



Figure 3.12 (c) Simulation result of the output clock  $V_{out}$ 

### **3.2.3 Performance Analysis**

As for the analog method, phase noise and power dissipation are used to evaluate the digital frequency doubler. Again, the duty cycle rate is not evaluated: it is spoiled after frequency doubling because of the different transition times of the NMOS and PMOS transistors, and a DCC will correct it afterwards.

### (1) Phase noise

Using the same equation (2.3) as in the analysis of the analog method, the phase noise floor can be calculated from the values of  $V_n$  and SR, which are obtained from time-domain Pnoise analyses in Cadence. Figure 3.13 (b) shows an example of a simulated waveform of  $V_n$  for the frequency doubler configuration of figure 3.11. The SR can be measured easily from figure 3.13 (a). Based on the values in figure 3.13 (a) and (b), the phase noise floor can be calculated as

$$L = 10\log_{10}\left[2 \cdot (3.14)^2 \cdot 40\mathrm{E}6 \cdot \left(\frac{4.99\mathrm{E}{-3}}{4.2\mathrm{E}10}\right)^2\right] \approx -168.31\ \mathrm{dBc/Hz}$$

at 40MHz. This is about 10dB lower than the phase noise floor of the analog method, at a cost of less than 2uW, as shown in figure 3.13 (c), instead of 0.6mW.



Figure 3.13 (a) Simulation waveform of the output clock  $V_{\mbox{\scriptsize out}}$ 



Figure 3.13 (b) Integrated noise  $V_n$  (V) at the middle point of the related edge





Figure 3.13 (c) Simulation waveform of the total current consumption I total using an XOR-gate

Using wider transistors can reduce the integrated voltage noise  $V_n$ . To explain this, consider the integrated voltage noise summary in table 3.4, obtained with the same schematic configuration as in figure 3.11. The noise summary shows that M5, M3, M1, M8 and M7 are the dominant noise contributors at the rising edge at 12.528ns; M12, M8, M1, M5 and M3 at the falling edge at 25.045ns; and M7, M6, M3, M9 and M1 at the next rising edge at 37.568ns. Increasing the width of these transistors reduces the integrated voltage noise  $V_n$ , as seen in table 3.5.

### Rising edge at timeindex= 12.528ns

| Device | Para | Noise<br>contribution  | % of total |
|--------|------|------------------------|------------|
| M5     | id   | 3.53E-6 V <sup>2</sup> | 26.33%     |
| M3     | id   | 2.56E-6 V <sup>2</sup> | 19.11%     |
| M1     | id   | 2.39E-6 V <sup>2</sup> | 17.8%      |
| M8     | id   | 1.95E-6 V <sup>2</sup> | 14.58%     |
| M7     | id   | 1.32E-6 V <sup>2</sup> | 9.83%      |
| M4     | fn   | 7.12E-7 V <sup>2</sup> | 5.31%      |
| M5     | id   | 3.02E-7 V <sup>2</sup> | 2.25%      |
| M2     | id   | 2.54E-7 V <sup>2</sup> | 1.89%      |
| M10    | id   | 2.02E-7 V <sup>2</sup> | 1.51%      |
| M9     | id   | 8.46E-8 V <sup>2</sup> | 0.63%      |

### Falling edge at timeindex= 25.045ns

| Device | Para | Noise<br>contribution   | % of total |
|--------|------|-------------------------|------------|
| M12    | id   | 5.95E-6 V <sup>2</sup>  | 27.13%     |
| M8     | id   | 4.43 E-6 V <sup>2</sup> | 20.2%      |
| M1     | id   | 4.15 E-6 V <sup>2</sup> | 18.94%     |
| M1     | fn   | 2.0 E-6 V <sup>2</sup>  | 9.12%      |
| M5     | id   | 1.29 E-6 V <sup>2</sup> | 5.87%      |
| M3     | id   | 1.18 E-6 V <sup>2</sup> | 5.36%      |
| M12    | fn   | 8.23 E-7 V <sup>2</sup> | 3.75%      |
| M11    | id   | 7.68 E-7 V <sup>2</sup> | 3.5%       |
| M6     | id   | 4.7 E-7 V <sup>2</sup>  | 2.15%      |
| M3     | id   | 3.98 E-7 V <sup>2</sup> | 1.81%      |

### Rising edge at timeindex= 37.568ns

| Device | Para | Noise<br>contribution  | % of total |
|--------|------|------------------------|------------|
| M7     | id   | 4.79E-6 V <sup>2</sup> | 40.32%     |
| M6     | id   | 3.51E-6 V <sup>2</sup> | 29.54%     |
| M3     | id   | 8.41E-7 V <sup>2</sup> | 7.08%      |
| M9     | id   | 7.1E-7 V <sup>2</sup>  | 5.97%      |
| M1     | id   | 4.17E-7 V <sup>2</sup> | 3.5%       |
| M7     | fn   | 3.22E-7 V <sup>2</sup> | 2.71%      |
| M9     | fn   | 3.05E-7 V <sup>2</sup> | 2.56%      |
| M4     | id   | 2.38E-7 V <sup>2</sup> | 2.0%       |
| M10    | id   | 2.31E-7 V <sup>2</sup> | 1.94%      |
| M2     | id   | 2.0E-7 V <sup>2</sup>  | 1.68%      |

Table 3.4 Integrated voltage noise V<sub>n</sub> noise summary

| Width                                                              |                   | at rising edge<br>12.5ns | at falling edge<br>25ns | at rising edge<br>37.5ns |
|--------------------------------------------------------------------|-------------------|--------------------------|-------------------------|--------------------------|
| 2um including all<br>transistors                                   | SR                | 1.65E10                  | 3.42E10                 | 1.28E10                  |
|                                                                    | V<sub>n</sub>     | 3.66mV                   | 4.68mV                  | 3.44mV                   |
|                                                                    | phase noise floor | -164dBc/Hz               | -168dBc/Hz              | -162.42dBc/Hz            |
| 10um including<br>(M1, M3, M5, M6,<br>M7, M8, M9, M10<br>and M12)  | SR                | 3.26E10                  | 1.92E10                 | 2.48E10                  |
|                                                                    | V<sub>n</sub>     | 1.43mV                   | 2.35mV                  | 2.91mV                   |
|                                                                    | phase noise floor | -178.19dBc/Hz            | -169.28dBc/Hz           | -169.64dBc/Hz            |
| 50um including<br>(M1, M3, M5, M6,<br>M7, M8, M9, M10<br>and M12)  | SR                | 3.5E10                   | 2.11E10                 | 1.2E10                   |
|                                                                    | V<sub>n</sub>     | 0.8mV                    | 1.12mV                  | 1.89mV                   |
|                                                                    | phase noise floor | -183.85dBc/Hz            | -176.53dBc/Hz           | -167.09dBc/Hz            |

Table 3.5 relation of  $V_{n}$  , SR, and phase noise floor
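The phase noise floor entries in table 3.5 can be reproduced from the listed  $V_n$  and SR values with a few lines of Python. The sketch assumes that equation (2.3) has the form used in the numerical expression of this section, L = 10·log10(2π²·f·(V<sub>n</sub>/SR)²), with f the 40MHz output frequency; the function name `phase_noise_floor_dbc` is introduced only for this illustration.

```python
import math


def phase_noise_floor_dbc(v_n, slew_rate, f_out):
    """Phase noise floor in dBc/Hz from integrated noise (V) and slew rate (V/s).

    Assumed form: L = 10*log10(2 * pi^2 * f_out * (v_n / slew_rate)^2).
    """
    jitter = v_n / slew_rate   # voltage noise translated to timing error (s)
    return 10 * math.log10(2 * math.pi ** 2 * f_out * jitter ** 2)


F_OUT = 40e6   # output frequency of the doubler

# (V_n, SR) pairs of the 2um row of table 3.5: rising, falling, rising edge
for v_n, sr in [(3.66e-3, 1.65e10), (4.68e-3, 3.42e10), (3.44e-3, 1.28e10)]:
    print(round(phase_noise_floor_dbc(v_n, sr, F_OUT), 2))
```

Under this assumption the three results agree with the 2um row of the table to within a few tenths of a dB. The quadratic dependence on V<sub>n</sub>/SR also shows why a lower  $V_n$  does not help if the SR drops at the same time.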

Looking at table 3.5, it is easy to find that with the increase of width of transistors (M1, M3, M5, M6, M7, M8, M9, M10 and M12) from 2um to 50um, the integrated voltage noise  $V_n$  goes down from 3.66mV to 0.8mV for rising edge 12.5ns, from 4.68mV to 1.12mV for falling edge 25ns, and from 3.44mV to 1.89mV for rising edge 37.5ns.

However, the SR does not show the same dependence on the width as  $V_n$ . As a result, the phase noise floor reduces from -164dBc/Hz to -183.85dBc/Hz for the rising edge at 12.5ns and from -168dBc/Hz to -176.53dBc/Hz for the falling edge at 25ns, whereas for the rising edge at 37.5ns it first decreases from -162.42dBc/Hz to -169.64dBc/Hz and then rises to -167.09dBc/Hz. Therefore, it is hard to derive a universal rule for reducing the phase noise, as was done for the analog method.

### (2) Power dissipation

The digital method of frequency doubling requires very little power. The circuit is fully digital, so current is consumed only while the transistors are switching. As illustrated in figure 3.13 (c), the supply current consists of only a few spikes, which is completely different from the analog method in figure 3.6 (c). Therefore, another big advantage of the digital method is that it costs less than 2uW to double the frequency while achieving a phase noise floor below -168dBc/Hz at an output frequency of 40MHz.

### 3.2.4 Limitations

Since the digital circuit already has the maximum input swing and its power consumption is lower than 2uW, the remaining limitations of the XOR-gate approach lie in the precision of the 90° delay cell and the accuracy of the duty cycle correction circuit.



Figure 3.14 Limitation of XOR-gate.

### (1) Precision of duty cycle correction circuit

A duty cycle correction circuit is necessary to keep the input clock at a perfect 50% duty cycle rate. The reason is similar to that for the analog method: for exact frequency doubling, edges 3 and 4 must arrive on time; the accuracy of edges 3 and 4 is determined by edges 1 and 2, which depend on whether the input clock keeps a perfect 50% duty cycle rate, as seen in figure 3.14.

### (2) Accuracy of 90° delay cell

The precision of the 90° delay cell determines the accuracy of the 50% duty cycle rate of the frequency-doubled output clock, and a DLL is needed to derive the second, 90°-delayed clock since the crystal oscillator can only generate a single-ended output clock. Looking at figure 3.14, edges 7 and 8 decide whether the duty cycle rate of the output clock is precisely 50%. However, the timing of edges 5 and 6 limits the precision of edges 7 and 8, which requires edge 5 to be delayed by exactly 90° with respect to edge 1. Hence, the performance of the 90° delay cell also limits the quality of the XOR-gate frequency doubler.
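The sensitivity to the delay can be made concrete with a small calculation. For an ideal input clock with a 50% duty cycle rate and period T, the XOR output is high for the delay time d in every half period T/2, so its duty cycle rate is 2d/T. The sketch below assumes ideal edges; the function name and the 10ps delay error are hypothetical values chosen for illustration.

```python
def xor_output_duty_cycle(period, delay):
    """Duty cycle rate of the XOR output for an ideal 50% input clock.

    The output is high for `delay` seconds in every half input period.
    """
    if not 0 < delay < period / 2:
        raise ValueError("delay must lie between 0 and half the input period")
    return 2 * delay / period


T_IN = 50e-9                      # 20 MHz input clock
ideal_delay = T_IN / 4            # exact 90 degree delay
delay_error = 10e-12              # hypothetical 10 ps delay error

print(xor_output_duty_cycle(T_IN, ideal_delay))
print(xor_output_duty_cycle(T_IN, ideal_delay + delay_error))
```

With these numbers a 10ps delay error shifts the output duty cycle rate by 0.04 percentage points, which illustrates why the delay cell has to be locked by a loop instead of being left open.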

# **3.3 Summary-- Comparison Between Analog Method and Digital Method**

After analyzing both the analog method and the digital method, a brief comparison between the two is in order. They share some requirements (a 90° delay and a DCC) and differ in phase noise floor, power consumption and area. The comparison is summarized in table 3.6.

### 3.3.1 Same Properties

(1) A duty cycle correction circuit is necessary. With either the analog method or the digital method, the duty cycle rate of the frequency-doubled output clock is spoiled: it is no longer 50%. Moreover, both methods require a square wave clock with a precise 50% duty cycle rate at the input, which means a DCC must be added to correct the duty cycle rate of the output clock of the crystal oscillator. Therefore, a duty cycle correction circuit is used to detect the duty cycle error and correct it at the same time.

(2) An accurate 90° delay cell is mandatory. Since a crystal oscillator only has a single-ended output, a delay-locked loop is needed to derive a second, 90°-delayed clock in both the analog and the digital method. The precision of the 90° delay cell determines the accuracy of the 50% duty cycle rate of the frequency-doubled output clock. Hence, the performance of the 90° delay cell limits the quality of the frequency doubler.

### **3.3.2 Difference**

### (1) Phase noise floor

The digital method has about 10dB lower phase noise than the analog method at a cost of less than 2uW instead of 0.6mW. To decrease the phase noise further, the analog method mainly depends on a larger differential input swing and a higher DC current for the two differential amplification stages. For the digital method, increasing the width of the relevant transistors results in less integrated voltage noise and leads to less phase noise in most cases.

### (2) Power dissipation

The previous discussion shows that the digital method is much more power-efficient than the analog method. Its power consumption is a factor of 300 lower: it achieves -168dBc/Hz at a cost of less than 2uW, whereas the analog method needs at least 0.6mW to achieve only -155dBc/Hz. With the analog method, more power has to be spent to lower the phase noise.

The power dissipation discussed here only takes the frequency doubler circuit into account and the power consumption of the duty cycle correction circuit and the 90° delay cell is not included. It will be considered in chapter 4 and chapter 5.

### (3) Area consideration

Comparing figure 3.4 and figure 3.11, we can find that the analog method costs more than 114um/0.13um active area to achieve -155dBc/Hz and digital method only needs 24um/0.13um to make -168dBc/Hz if we define the area as the product of summed widths of all transistors and transistor length. From this point of view, the digital method is more efficient in saving area to generate less phase noise.

|         | Phase noise | Power | Area W/L      | 90° delay | DCC    | Input<br>square<br>wave<br>50% |
|---------|-------------|-------|---------------|-----------|--------|--------------------------------|
| Analog  | -155dBc/Hz  | 0.6mW | >114um/0.13um | Needed    | Needed | Needed                         |
| Digital | -168dBc/Hz  | <2uW  | 24um/0.13um   | Needed    | Needed | Needed                         |

Table 3.6 Comparison between analog method and digital method

\*Area: the ratio between the sum of the width of all the transistors and length.

\*Power dissipation: only frequency doubler, without considering DCC and DLL

# Chapter 4 90° Delay Cell

According to the discussion in chapter 3, a precise 90° delay cell is essential for both the analog circuit and the digital circuit (figure 4.1 shows where the 90° delay cell is used in the digital frequency doubling circuit). A delay cell is usually made with a delay-locked loop (DLL) [7][8], so this chapter first gives a short overview of the DLL. Then the actual implementation of the 90° delay cell is presented.



Figure 4.1 The entire architecture of the frequency doubler using the digital method

## 4.1 Overview of the DLL

Many applications require accurate positioning of the phase of a clock or of a data signal. Although simply delaying the signal could shift the phase, the phase shift is not robust to variations in processing, voltage, or temperature. For more precise control, designers incorporate the phase shift into a feedback loop that locks the output phase to an input reference signal that indicates the desired phase shift. In essence, the loop is identical to a phase-locked loop (PLL) [9] except that phase is the only state variable and that a variable-delay line replaces the oscillator. Such a loop is commonly referred to as a delay-line phase-locked loop or delay-locked loop (DLL). As with a PLL, the goals are (1) accurate phase position or low static-phase offset, and (2) low phase noise or jitter.



Figure 4.2 Classical architecture of DLL

The basic loop building blocks are similar to those of a PLL: a phase detector, a filter, and a variable-delay line. Figure 4.2 shows a classical architecture of a DLL in which  $CK_1$  works as the input reference clock driving the delay line, which consists of a number of cascaded variable delay elements. The output clock  $CK_2$  drives the loop phase detector. The output of the phase detector is integrated by the charge pump and the loop filter capacitor to generate the loop control voltage V<sub>c</sub>. Figure 4.3 shows the commonly used schematic of a phase detector and a charge pump with a low pass filter (LPF). The phase detector is implemented with two edge-triggered, resettable D flip-flops with their D inputs tied to a logical One. The charge pump consists of two switched current sources that pump charge into or out of the loop filter according to the two logical inputs Q<sub>A</sub> and Q<sub>B</sub>.



Figure 4.3 Common-used schematic of phase detector and charge pump.



Figure 4.4 Waveform of phase detector with its charge pump

To better understand how a DLL works, let's look at the waveforms of the phase detector and the charge pump in figure 4.4. When  $CK_2$  is a little later than  $CK_1$ ,  $Q_A$  goes high and stays high. When  $CK_2$  becomes logical One,  $Q_B$  goes high as well, and the output of the AND gate goes high to reset both flip-flops. Then  $Q_A$  and  $Q_B$  return to logical Zero. In other words,  $Q_A$  is high for a short time  $\Delta t$ . During this period  $\Delta t$ , S1 is turned on and  $I_1$  charges capacitor  $C_p$ , generating the voltage  $V_p$  which controls the delay elements to make  $CK_2$  arrive earlier. After a number of cycles,  $CK_1$  and  $CK_2$  share the same phase and  $Q_A = Q_B = 0$ : both S1 and S2 are open,  $I_1$  does not charge and  $I_2$  does not discharge  $C_p$ , and  $V_p$  remains constant. The desired  $V_{out}$  can be taken from the output of the corresponding delay element. Since the DLL is a big topic, it is not treated further in this report; for more information, readers may consult the related papers [7], [8] and [10].
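The locking behaviour described above can be sketched with a first-order, cycle-by-cycle model. This is a simplified illustration and not the circuit of this project: it assumes that the delay decreases linearly with the control voltage, and the charge pump current, capacitor value, delay gain `k_delay` and initial error are hypothetical numbers.

```python
def simulate_dll(initial_error, i_cp, c_p, k_delay, cycles):
    """Return the timing error (s) of CK2 versus CK1 in each reference cycle."""
    v_p = 0.0                      # control voltage on the loop capacitor (V)
    errors = []
    for _ in range(cycles):
        # Assumed linear delay line: CK2 arrives earlier as v_p rises.
        error = initial_error - k_delay * v_p
        errors.append(error)
        # Q_A is high for `error` seconds, so I_1 adds charge I*dt to C_p.
        v_p += i_cp * error / c_p
    return errors


errors = simulate_dll(initial_error=1e-9,   # CK2 starts 1 ns late
                      i_cp=10e-6,           # charge pump current (A)
                      c_p=1e-12,            # loop capacitor (F)
                      k_delay=1e-8,         # delay gain (s/V)
                      cycles=100)
print(errors[0], errors[-1])
```

In this model the error shrinks by the factor 1 - k_delay·I/C every cycle (0.9 for the values above), so the loop settles as long as that product stays between 0 and 2.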

### 4.2 Implementation of the 90° Delay Cell

### 4.2.1 Delay Element

In this project, the individual delay element consists of two inverters. The first inverter, which creates the delay, is starved by two current sources that supply the same amount of current, as shown in figure 4.5 (a): M1 and M2 work as two switches controlled by the input, and M3 and M4 are two current sources with the same DC bias current. Figure 4.5 (b) shows the desired output waveform of a single delay element. The timing of its rising edge is controlled by the current of transistor M3, which is set by its gate voltage  $V_{cn}$ ; in the same way,  $V_{cp}$  determines the timing of the falling edge. A second inverter follows to increase the slew rate of the delayed clock.



Figure 4.5 Individual delay element

To quantify the delay, let us look at figure 4.6 (a) (b). The ideal model of the delay element is shown in figure 4.6 (a), where C is the total output capacitance of the delay element and  $V_C$  represents the voltage across capacitor C. Assuming switch M1 is turned on,  $I_1$  charges capacitor C, but it needs a certain time to charge C from a logical Zero to a logical One. Equation (4.1) gives the relation between V<sub>c</sub>, C, T and the current I<sub>s</sub>, in which T is the time delay. By changing the parameters I<sub>s</sub> and C, the desired delay T can be achieved.



Figure 4.6 Ideal model of a single delay element
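As a rough numerical illustration of the ideal model, the sketch below assumes that the relation takes the usual constant-current form V<sub>c</sub> = I<sub>s</sub>·T/C, which is consistent with the capacitor expressions used in chapter 5. The capacitance of 100fF and the 0.6V switching threshold (half of the 1.2V supply) are hypothetical values, not taken from the design.

```python
def charge_time(capacitance, delta_v, current):
    """Time (s) for a constant current to change a capacitor voltage by delta_v."""
    return capacitance * delta_v / current


def current_for_delay(capacitance, delta_v, delay):
    """Constant current (A) needed to reach delta_v after `delay` seconds."""
    return capacitance * delta_v / delay


C_OUT = 100e-15        # assumed total output capacitance (F)
V_SWITCH = 0.6         # assumed switching threshold, half of a 1.2 V supply
T_90 = 12.5e-9         # 90 degrees of a 20 MHz clock

i_s = current_for_delay(C_OUT, V_SWITCH, T_90)
print(i_s, charge_time(C_OUT, V_SWITCH, i_s))
```

With these assumed values a single element would need a starving current of a few microamperes to produce the full 12.5ns; in a delay line built from several cascaded elements each element only has to provide its share of the delay.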

### 4.2.2 Implementation of the 90° Delay Cell.

The limitation of the conventional DLL is that only the rising edge is phase detected and tuned to arrive at the same time as that of the input reference clock, while the falling edge is not controlled, as shown in figure 4.3 and figure 4.4. However, both the analog method and the digital method of frequency doubling require the falling edge to be delayed by 90° as precisely as the rising edge. Therefore, a novel delay-locked loop is proposed in this project.

It consists of two phase detectors and two charge pumps, which form two loops to make a precise 90° delay, one for the rising edge and one for the falling edge, as illustrated in figure 4.7. Since the conventional phase detector only detects rising edges, an inverter is added to transform the falling edge into a rising edge, followed by a second phase detector and charge pump. In this way the falling edge is also phase detected and its 90° delay can be as accurate as that of the rising edge.



Figure 4.7 The architecture of DLL with two PFD and CP

Figure 4.8 (a) (b) shows the simulation waveforms of the input clock and the 90° delay output clock. We can see that the rising edge has 10.7ps delay error, and the falling edge has 14.5ps delay error.

Figure 4.9 shows the total current consumption of the 90° delay cell, which is about 600uA. The power dissipation of the 90° delay cell is about 0.72mW, most of which is consumed by the delay elements.



Figure 4.8 Simulation waveform of (a) the input clock (b) the 90° delay output clock



Figure 4.9 Simulation waveform of the total current consumption of DLL

### 4.2.3 Limitations

The accuracy of the 90° delay of both the rising edge and the falling edge is limited by the performance of the charge pump of the DLL as shown in figure 4.7. The limitations are (1) channel charge injection, (2) voltage compliance, (3) random mismatch between Up and Down current ( $I_1$  and  $I_2$ ), and (4) channel-length modulation.

### (1) Channel charge injection

Channel charge injection creates an offset voltage across the capacitor C and eventually causes a phase offset in the 90° delay. Figure 4.10 (a) shows an ideal charge pump and figure 4.10 (b) a practical one, in which the transistors M1 and M2 work as the switches S1 and S2, and M3 and M4 work as the current sources  $I_2$  and  $I_1$ . The switching transistors M1 and M2 carry a certain amount of mobile charge in their inversion layers when they are on. When they turn off, this mobile charge is injected into the capacitor C and creates an undesired offset voltage  $\Delta V_c$ , which acts on the delay element and creates a phase offset at the output, spoiling the precise 90° delay. To reduce channel charge injection, smaller switches are preferred.



Figure 4.10 (a) The schematic of an Ideal charge pump (b) The schematic of a practical charge pump

### (2) Voltage compliance

The output compliance of the voltage across the capacitor C is limited. As shown in figure 4.10 (b), each of the current sources M3 and M4 requires a minimum drain-source voltage and each of the switches M1 and M2 sustains a voltage drop. The actual output compliance is therefore equal to the supply voltage minus two overdrive voltages and two switch drops. This limitation compromises the accuracy of the 90° delay.

To obtain a larger output compliance, larger transistors should be used, but this increases the channel charge injection.

### (3) Random mismatch between Up and Down current

In a practical circuit, the current sources M3 and M4 inevitably suffer from random mismatch, which means that the currents  $I_1$  and  $I_2$  are not exactly equal. The difference between  $I_1$  and  $I_2$  generates an offset voltage across the capacitor C, and this offset voltage produces a phase offset at the output.

The random mismatch between the Up and Down currents can be reduced by enlarging the current-source transistors. Recall from analog design that as the device area increases, mismatches experience greater spatial averaging. However, larger transistors suffer from a greater amount of channel charge injection.

### (4) Channel-length modulation

The Up and Down currents also incur mismatch due to channel-length modulation of the current sources. A change in the voltage across the capacitor C changes the drain-source voltages of the two current sources in opposite directions, thereby creating a larger mismatch. As mentioned before, this mismatch causes a phase offset at the 90° delay output clock.

To reduce channel-length modulation, a longer current source transistor is needed. However, this brings about more channel charge injection.

# 4.3 Summary

Chapter 4 discusses a simple realization of a 90° delay cell. First it gives a short introduction to the delay-locked loop, including the classical architecture of a DLL and a commonly used phase detector and charge pump. Then it implements the DLL with two phase detectors and charge pumps, one for the rising edge and one for the falling edge, with an inverter starved by two current sources working as the individual delay element. Finally, the limitations on the accuracy of the 90° delay are discussed: the channel charge injection, the voltage compliance, the random mismatch between the Up and Down current sources, and the channel-length modulation of the current sources. Since the DLL is a big topic, it cannot be fully covered in this report. This chapter only proposes a simple way to make sure both the rising edge and the falling edge have a precise 90° delay.

# **Chapter 5 Duty Cycle Correction Circuit (DCC)**

As discussed before, the duty cycle rate is spoiled: it is no longer 50% either after frequency doubling (in an analog or a digital way) or after converting the output of a crystal oscillator from a sine wave to a square wave, because of the different transition times of NMOS and PMOS transistors. Therefore there is a need to reduce this error by means of a detection and correction circuit. Figure 5.1 shows where the duty cycle correction circuit is used in the digital frequency doubling circuit.



Figure 5.1 The entire architecture of the frequency doubler using the digital method

## 5.1 Idea

Several duty cycle correction circuits are described in [10-12]. The essential idea of these circuits is almost the same: find a way to detect the error and use it to compensate the input reference clock with a feedback loop.

The method to detect the duty cycle error is to measure the two time periods, $T_1$ and $T_2$, whose difference defines the duty cycle error. To distinguish between the two time periods, the waveform is divided such that the time period $T_1$ represents an 'on' state and the time period $T_2$ represents an 'off' state, as shown in figure 5.2.



Figure 5.2 A clock waveform with duty cycle error

Since the difference between the two time periods is the error of interest, there has to be a way to compare them. This points to the need for a memory element that stores the time information of $T_1$ and compares it with the time information of $T_2$. A straightforward approach is to charge a capacitor during the 'on' state and discharge it during the 'off' state, as illustrated in figure 5.3, in which V represents the voltage across the capacitor. During $T_1$, V charges to $V_1 = V(t=T_1) = I_s T_1/C$, and during $T_2$, V discharges to $V(t=T_2) = V_1 - I_s T_2/C$, where $I_s$ is the charging or discharging current. After one cycle the net voltage across the capacitor, $\Delta V_{error} = I_s \Delta t/C$ with $\Delta t = T_1 - T_2$, represents the duty cycle error $\Delta t$. Compensating the input reference clock with the average of this error voltage using a feedback loop corrects the duty cycle rate and makes it approach 50%.
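As a numerical illustration of the charge and discharge relations above, the following minimal sketch evaluates the voltage left on the capacitor after one clock cycle. The 100 ps difference between the two half periods is an assumed example value; $I_s$ = 20 uA and C = 1 pF are taken from the last row of Table 5.2, and the 20 MHz clock is the input frequency used in the simulations.

```python
def cap_voltage_after_cycle(t_on, t_off, i_s, c):
    """Net voltage left on C after charging for t_on and discharging for t_off."""
    v_after_on = i_s * t_on / c            # V(t = T1) = Is * T1 / C
    return v_after_on - i_s * t_off / c    # V(t = T2) = V1 - Is * T2 / C

period = 50e-9          # 20 MHz clock
dt = 100e-12            # assumed difference between T1 and T2
t1 = (period + dt) / 2
t2 = (period - dt) / 2
v_err = cap_voltage_after_cycle(t1, t2, i_s=20e-6, c=1e-12)
print(f"error voltage per cycle = {v_err * 1e3:.3f} mV")
```

With these values the error voltage equals $I_s \Delta t / C$ = 2 mV per cycle, independent of the absolute length of the period.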



Figure 5.3 Measuring duty cycle error

### **5.2 Circuit Implementation**

#### **5.2.1 Error Detection**



Figure 5.4 Error detection circuit

The detection of the duty cycle error can be realized using the schematic shown in figure 5.4. Transistors M11-M14 work as four switches. The basic operation is to charge and discharge capacitor C during $T_1$ and $T_2$ respectively (figure 5.3). During time period $T_1$, M11 and M14 are turned on whereas M12 and M13 are turned off, generating a positive voltage across capacitor C. During time period $T_2$ the process is reversed, and the charge build-up is negative. After time period $T_2$, a residual voltage is present across capacitor C, representing the time difference between $T_1$ and $T_2$.

### **5.2.2 Error Correction**

As mentioned before, the error correction is done by taking the average voltage across the capacitor C and using it to compensate the input reference clock. Figure 5.5 shows the error correction circuit for converting a sine wave into a square wave with a 50% duty cycle rate. A subtractor is connected across the capacitor C to convert its differential output signal into a single-ended signal. After this signal passes through a low pass filter, the average voltage across capacitor C is obtained. Adding this average voltage to the input reference voltage V<sub>ref</sub> generates a new reference voltage for the comparator. After comparison with the input signal (the sine wave, which can also be a square wave), the duty cycle rate of the comparator output is improved and approaches 50%.



Figure 5.5 The error correction circuit

### **5.2.3 Entire Duty Correction Circuit**

Combining the error detection circuit and the correction circuit yields the entire duty cycle correction circuit shown in figure 5.6. Assuming an ideal sine wave as the input of the comparator, the output of the comparator is a square wave with a spoiled duty cycle rate due to the different transition times of the NMOS and PMOS transistors in the comparator. This spoiled square wave passes through the error detection circuit, creating the error voltage across capacitor C. After subtraction, this error voltage goes into the low pass filter to produce the average error voltage. Adding this average error voltage to the original reference voltage $V_{ref}$ creates a new reference voltage. With this new reference voltage, the duty cycle rate of the output square wave is modified and approaches 50%.

Figure 5.7 shows a practical schematic of the entire duty cycle correction circuit. M1-M9 form a comparator, in which M1-M5 make up a differential pair loaded with an active current mirror as the first amplification stage, M7 forms a common source stage [13] with current source M6 as the second amplification stage, and M8 and M9 form an inverter to increase the slew rate of the output signal.
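The feedback action described above can be illustrated with a purely behavioral model. The sketch below is not the transistor-level circuit of this work: it assumes an ideal comparator slicing a sine wave, represents the unequal transition times by a hypothetical static duty-cycle offset `skew`, and lumps the detector, subtractor and low pass filter into a single integrator with an arbitrary `gain`. Only $V_{ref}$ = 0.6 V is taken from the text; the amplitude and common-mode level are assumptions.

```python
import math

def duty_cycle(v_threshold, amplitude=0.6, v_cm=0.6, skew=0.01):
    """Duty cycle of a comparator slicing a sine wave at v_threshold.

    skew is a hypothetical static duty-cycle offset standing in for the
    unequal NMOS/PMOS transition times.
    """
    x = max(-1.0, min(1.0, (v_threshold - v_cm) / amplitude))
    return 0.5 - math.asin(x) / math.pi + skew

def run_loop(v_ref=0.6, gain=0.5, steps=200):
    """Integrate the duty-cycle error and add it to the reference voltage."""
    v_corr = 0.0
    for _ in range(steps):
        error = duty_cycle(v_ref + v_corr) - 0.5
        v_corr += gain * error   # averaged error voltage added to V_ref
    return v_corr, duty_cycle(v_ref + v_corr)

v_corr, duty = run_loop()
print(f"correction = {v_corr * 1e3:.2f} mV, duty cycle = {duty * 100:.4f} %")
```

In this idealized model the loop settles at the threshold shift that exactly cancels the assumed skew, so the remaining duty cycle error is set only by non-idealities that the model leaves out, such as charge injection.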



Figure 5.6 Complete duty cycle correction circuit

The error detection circuit is made up of transistors M10-M14, in which M10 acts as the tail current source. Because the input of the error detection circuit is a differential signal, an inverter made of transistors M16 and M17 is inserted between the output of the comparator and the negative input of the error detection circuit. However, this inverter might cause some unexpected delay because of its transition time. To compensate for this delay, a transmission gate made of transistors M15 and M16 is placed between the output of the comparator and the positive input of the error detection circuit. C1, R1, an ideal voltage-controlled voltage source (VCVS) E0, and the DC voltage V<sub>ref</sub> (V<sub>dc</sub>=0.6V) compose the error correction circuit. Due to limited time, an ideal voltage-controlled voltage source is used as the subtractor instead of a practical subtractor in this project. The adder is simply implemented by connecting the DC voltage source V<sub>ref</sub> in series with the output of the low pass filter.

Following the configuration shown in figure 5.7, a transient simulation is done to check the validity of the duty cycle correction circuit, modeling $V_s$ as an ideal 20MHz sine wave (representing the output signal of a crystal oscillator) with a perfect 50% duty cycle rate as the input signal. The simulation results are shown in figure 5.8 (a)-(f). Figure 5.8 (a) shows that after duty cycle correction there is still a time error of about 17.2ps, which corresponds to a duty cycle error of about 0.0344% at 20MHz. The charging and discharging of capacitor C by the tail current is shown in figure 5.8 (b), which shows that the capacitor acts as a pure integrator; the spikes are caused by the channel charge injection (explained in the following section). The output of the subtractor, the output of the low pass filter, the reference voltage for the comparator and the total current consumption (power) are shown in figure 5.8 (c), (d), (e) and (f) respectively.
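The conversion between the time error and the duty cycle error quoted above is a simple ratio to the clock period. The small helper below (the function name is chosen here for illustration) reproduces the 17.2ps figure at 20MHz.

```python
def duty_cycle_error_percent(time_error_s, clock_hz):
    """Duty-cycle error as a percentage of the clock period."""
    return time_error_s * clock_hz * 100.0

residual = duty_cycle_error_percent(17.2e-12, 20e6)
print(f"duty cycle error = {residual:.4f} %")
```

The same conversion applies to every row of the simulation tables in section 5.3.1, since all of them use the 20MHz input clock.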



Figure 5.7 The schematic of a practical duty cycle correction circuit



Figure 5.8 (a) Simulation waveform of the output signal  $V_{\mbox{\scriptsize out}}$  of DCC



Figure 5.8 (b) Simulation waveform of the positive input and the negative input of the subtractor



Figure 5.8 (c) Simulation waveform of the output signal  $V_{\mbox{\scriptsize out\_subtractor}}$  of the subtractor




Figure 5.8 (d) Simulation waveform of the output signal of the low pass filter  $V_{\text{out\_LPF}}$ 



Figure 5.8 (e) Simulation waveform of the reference input signal  $V_{\mbox{\tiny ref}}$  of the comparator



Figure 5.8 (f) Simulation waveform of the total current consumption  $\rm I_{total}$  of the DCC

## **5.3 Performance Analysis**

According to the specifications proposed in section 1.3, the duty cycle rate, the phase noise floor and the power dissipation are the main parameters used to qualify the circuit in this project. However, due to limited time, the phase noise floor and power dissipation of the DCC are not covered in detail. This section mainly focuses on the duty cycle rate.

### 5.3.1 Duty Cycle Rate

The quality of the duty cycle correction circuit is mainly determined by three factors: the channel charge injection, the cutoff frequency of the low pass filter and the tail current source  $I_{S}$.

### (1) Channel charge injection

The channel charge injection is the dominant error source affecting the performance of the duty cycle correction. Recall that the duty cycle detection works by charging and discharging capacitor C through switches M11-M14, as indicated in figure 5.4; the channel charge injected by these switches always ends up on capacitor C and creates a static error. The channel charge is usually quantified by equation (5.1):

$$Q_{ch} = WL C_{ox}(V_{DD} - V_{in} - V_{th}) \tag{5.1}$$

Using minimum-size switches is the straightforward way to reduce the channel charge injection and thus the duty cycle error. Table 5.1 and figure 5.9 show the simulated relation between the size of the switches (M11-M14) and the duty cycle error (same configuration as in figure 5.7 except for the switch size). The time error stays between roughly 16ps and 23ps for widths up to 2.4um and rises steeply for wider switches, reaching more than 2ns at 38.4um; the duty cycle error degrades accordingly.
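To give a feeling for the magnitudes involved in equation (5.1), the sketch below evaluates the channel charge for a few switch widths from Table 5.1 and converts it into a voltage step on the integration capacitor. The oxide capacitance, the bias voltages and the assumption that half of the charge reaches capacitor C are illustrative guesses, not values extracted from the technology or from the simulations in this work.

```python
def channel_charge(width_um, length_um, c_ox_f_per_um2, v_dd, v_in, v_th):
    """Channel charge of an on-state NMOS switch, equation (5.1)."""
    return width_um * length_um * c_ox_f_per_um2 * (v_dd - v_in - v_th)

# Assumed values for illustration only (not taken from the process kit).
C_OX = 15e-15      # F/um^2
V_DD, V_IN, V_TH = 1.2, 0.6, 0.3
C_INT = 1e-12      # integration capacitor, 1 pF as in Table 5.2

for width in (0.2, 2.4, 38.4):          # widths from Table 5.1, L = 0.13 um
    q = channel_charge(width, 0.13, C_OX, V_DD, V_IN, V_TH)
    dv = 0.5 * q / C_INT                # assume half the charge reaches C
    print(f"W = {width:5.1f} um: Q = {q * 1e15:.3f} fC, dV = {dv * 1e6:.1f} uV")
```

The charge, and therefore the static error voltage, grows linearly with the gate area in this first-order picture; the simulated time errors in Table 5.1 do not follow such a simple law, which is in line with the remark below that the charge splitting is a complex function of several parameters.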

| Width (um) | Length (um) | Time error (absolute value) | Duty cycle error |
|------------|-------------|-----------------------------|------------------|
| 0.2        | 0.13        | 17ps                        | 0.034%           |
| 0.6        | 0.13        | 23ps                        | 0.046%           |
| 1.2        | 0.13        | 16.4ps                      | 0.0328%          |
| 2.4        | 0.13        | 16.4ps                      | 0.0328%          |
| 4.8        | 0.13        | 82.9ps                      | 0.1658%          |
| 9.6        | 0.13        | 583.8ps                     | 1.1676%          |
| 19.2       | 0.13        | 1477.9ps                    | 2.9558%          |
| 38.4       | 0.13        | 2189.7ps                    | 4.3794%          |
| 76.8       | 0.13        | 1872.1ps                    | 3.7442%          |
| 100        | 0.13        | 1679.8ps                    | 3.3596%          |

Table 5.1 Simulation result of relation between the size of switches and duty cycle error



Figure 5.9 Simulation result of the relation between the duty cycle error and the width of the switches

In reality, the fraction of charge that exits through the source and drain terminals, as illustrated in figure 5.10, is a relatively complex function of various parameters such as the impedance seen at each terminal to ground and the transition time of the clock [14-15]. Investigations of this effect have not yielded any rule of thumb that can predict the charge splitting in terms of such parameters.



Figure 5.10 Charge injection when a switch turns off

#### (2) Tail current source I<sub>s</sub>

A larger tail current  $I_s$  enhances the precision of the duty cycle correction. This can be understood using the concept of SNR: the signal is the voltage integrated across the capacitor C by the tail current  $I_s$, as shown in figure 5.7, and the noise is the channel charge injection. A larger  $I_s$  accumulates more voltage across capacitor C, so the signal increases while the channel charge injection stays the same, which results in a higher SNR. In other words, the effect of the channel charge injection is relaxed. Therefore, the precision of the duty cycle correction is improved by using a larger tail current  $I_s$.

Table 5.2 shows the simulated relation between the tail current source and the duty cycle error, using the same configuration as in figure 5.7 except for  $I_s$. The simulation results show that the improvement of the duty cycle correction from increasing the tail current is limited, which means that channel charge injection is still the dominant contributor spoiling the quality of the duty cycle correction circuit.

| I <sub>s</sub> (uA) | C (pF) | Time error | Duty cycle error |
|---------------------|--------|------------|------------------|
| 1                   | 1      | 28.9ps     | 0.0578%          |
| 3                   | 1      | 21.2ps     | 0.0424%          |
| 6                   | 1      | 18.5ps     | 0.037%           |
| 20                  | 1      | 17.2ps     | 0.0344%          |

Table 5.2 Relation between the tail current source  $\mathsf{I}_{\mathsf{S}}$  and the duty cycle error
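One way to read Table 5.2 is to compare it with a first-order expectation. If a fixed injected charge Q were the only error, the static voltage Q/C would correspond to a time error $\Delta t = Q/I_s$ (from $\Delta V_{error} = I_s \Delta t/C$), that is, an error inversely proportional to $I_s$. This interpretation is an assumption made here for illustration and is not part of the original analysis. The sketch scales the 1 uA result accordingly and prints it next to the simulated values.

```python
# Simulated time errors from Table 5.2 (C = 1 pF).
simulated_ps = {1: 28.9, 3: 21.2, 6: 18.5, 20: 17.2}   # Is in uA -> error in ps

# First-order expectation if a fixed injected charge Q were the only error:
# dV = Q / C and dV = Is * dt / C, hence dt = Q / Is (inverse scaling with Is).
reference_i = 1
for i_s, sim in simulated_ps.items():
    ideal = simulated_ps[reference_i] * reference_i / i_s
    print(f"Is = {i_s:2d} uA: simulated {sim:5.1f} ps, 1/Is scaling {ideal:5.2f} ps")
```

The simulated error falls far more slowly than the inverse scaling would suggest, which agrees with the observation that the benefit of a larger tail current is limited.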

### (3) The cutoff frequency of the low pass filter

The cutoff frequency of the low pass filter influences the accuracy of the duty cycle correction. When the frequency of the input signal is much higher than the cutoff frequency of the low pass filter, the output of the low pass filter is the DC component of the input signal. This is why a low pass filter is used in the DCC to take the average voltage across the capacitor C, as shown in figure 5.6. The lower the cutoff frequency, the more precisely the DC component is extracted and the closer the duty cycle rate is to 50%.
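As a rough illustration of how strongly the filter suppresses the clock-rate ripple, the sketch below assumes that R1 and C1 form a single-pole RC low pass filter and uses the values R1 = 200 kΩ and C1 = 50 pF quoted for the phase-noise simulation in section 5.3.2. The single-pole model is an assumption made here.

```python
import math

def rc_cutoff_hz(r_ohm, c_farad):
    """-3 dB frequency of a first-order RC low-pass filter."""
    return 1.0 / (2.0 * math.pi * r_ohm * c_farad)

def rc_attenuation_db(freq_hz, r_ohm, c_farad):
    """Magnitude response of the same filter at freq_hz, in dB."""
    ratio = freq_hz / rc_cutoff_hz(r_ohm, c_farad)
    return -10.0 * math.log10(1.0 + ratio ** 2)

R1, C1 = 200e3, 50e-12        # values quoted for the phase-noise simulation
f_c = rc_cutoff_hz(R1, C1)
ripple_db = rc_attenuation_db(20e6, R1, C1)
print(f"cutoff = {f_c / 1e3:.1f} kHz, attenuation at 20 MHz = {ripple_db:.1f} dB")
```

With these values the cutoff lies more than three decades below the 20 MHz clock, so the fundamental ripple is attenuated by roughly 60 dB; a lower cutoff improves the averaging at the cost of a slower loop.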

### 5.3.2 Phase Noise

The analysis of the phase noise floor of the duty cycle correction circuit is complicated, because it requires deriving the transfer function of the entire loop. Due to limited time, this part is not covered in this project. However, two simulation results are presented below to show the complexity of analyzing the phase noise of the DCC.

Based on the configuration in figure 5.7 (except for R1=200K, C1=50pF, and C=10pF), the integrated voltage noise  $V_n$  is 83.266mV at the rising edge and 84.019mV at the falling edge, as shown in figure 5.11. The phase noise floor of these two edges can then be calculated as -132dBc/Hz and -132.2dBc/Hz at 40MHz, which is about 20dB worse than the required specification of -151dBc/Hz. Table 5.3 shows the integrated voltage noise summary of these two edges.






At rising edge timeindex = 0.17996ns

| Device | Para | Noise contribution (V<sup>2</sup>) | % of total |
|--------|------|------------------------------------|------------|
| M2     | id   | 0.00197                            | 28.47%     |
| M1     | id   | 0.00179                            | 25.88%     |
| M4     | id   | 0.000832                           | 12%        |
| M2     | fn   | 0.000521                           | 7.51%      |
| M1     | fn   | 0.000505                           | 7.28%      |
| M3     | id   | 0.000485                           | 6.99%      |

At falling edge timeindex = 25.199ns

| Device | Para | Noise contribution (V<sup>2</sup>) | % of total |
|--------|------|------------------------------------|------------|
| M2     | id   | 0.001927                           | 27.29%     |
| M1     | id   | 0.001878                           | 26.59%     |
| M4     | id   | 0.000829                           | 11.74%     |
| M2     | fn   | 0.000584                           | 8.26%      |
| M1     | fn   | 0.000529                           | 7.48%      |
| M3     | id   | 0.000474                           | 6.71%      |

Table 5.3 Integrated voltage noise summary of the rising edge and the falling edge
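The percentages in Table 5.3 can be cross-checked against the integrated noise voltages quoted in the text, since each share is the device contribution divided by $V_n^2$. The short check below does this for the largest contributor (M2, id), pairing the 0.00197 V² entry with the rising-edge value and the 0.001927 V² entry with the falling-edge value.

```python
v_n_rising = 83.266e-3       # integrated noise voltage at the rising edge, V
v_n_falling = 84.019e-3      # integrated noise voltage at the falling edge, V

# Largest contributor (M2, id) from Table 5.3, in V^2.
m2_rising, m2_falling = 0.00197, 0.001927

share_rising = 100.0 * m2_rising / v_n_rising ** 2
share_falling = 100.0 * m2_falling / v_n_falling ** 2
print(f"M2 id share: rising {share_rising:.1f} %, falling {share_falling:.1f} %")
```

Both results agree with the tabulated 28.47% and 27.29% to within the rounding of the listed contributions.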

Looking at table 5.3, it is clear that the transistors M1-M4 of the comparator are the dominant noise sources. So to have less phase noise, the noise contribution of the comparator should be decreased.

An ideal comparator is then used to replace the practical comparator composed of transistors M1-M9, which means the noise contributed by the comparator is zero. Theoretically, both the integrated voltage noise  $V_n$  and the phase noise should then decrease according to the noise summary in table 5.3. However, the simulation result shown in figure 5.12 contradicts this expectation. The  $V_n$  is 89.393mV at the rising edge and 89.382mV at the falling edge, which is even larger than with the practical comparator. This result is unexpected. To explain it, it is necessary to derive the transfer function of the entire loop to see which component is actually the dominant noise source and to modify it to obtain less phase noise at the output.



Figure 5.12 Simulation waveform of  $V_{out}$  and  $V_n$  with an ideal comparator

### **5.3.3 Power Dissipation**

The power dissipation of the duty cycle correction circuit is mainly determined by the DC bias current of the comparator. This is confirmed by the simulation waveforms shown in figure 5.13 (a) and (b) (the simulation uses the same configuration as in figure 5.7, and the DC bias current of the comparator is 1mA). Figure 5.13 (a) shows that the average total current consumption of the DCC, I<sub>total</sub>, is almost the same as the current consumption of the comparator shown in figure 5.13 (b).



Figure 5.13 (a) Simulation waveform of the total current consumption I\_total of the DCC

(b) Simulation waveform of the current consumption of the comparator I\_comparator

As in the discussion of the frequency doubling circuit, the DC bias current of the comparator is related to the noise requirement: the more current is consumed, the less noise is produced. Therefore, the noise specification must be taken into account when considering the power dissipation of the DCC.

# **5.4 Limitations**

Like other circuits, the duty cycle correction circuit also has limitations.

### 5.4.1 The Size of Switch

The minimum size of the switches is determined by the technology itself. As discussed before, a smaller switch generates less channel charge injection, so minimum-size switches should be used. However, a smaller transistor has a larger on-resistance, which brings about more thermal noise that is eventually converted into phase noise. Therefore, there is a tradeoff in the size of the switches between the phase noise floor and the channel charge injection effect.

### 5.4.2 Tail Current Source and Capacitor C

A large voltage swing across the capacitor might force the tail current source into the triode region. The tail current source then no longer delivers a constant current, which gives rise to an undesired integrated voltage error across capacitor C and degrades the quality of the duty cycle correction circuit. As this voltage swing is determined by the value of  $I_s$  and capacitor C, there is a tradeoff between those two parameters.
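The size of this swing can be estimated from the ramp $I_s T/(2C)$ that builds up during one half period. The sketch below assumes an ideal constant current, a 50% duty cycle and the 20MHz clock used in the simulations, and evaluates the ramp for the tail currents of Table 5.2 with C = 1 pF.

```python
def peak_swing_v(i_s_a, c_f, clock_hz):
    """Voltage ramp on C during one half period of the clock."""
    half_period = 0.5 / clock_hz
    return i_s_a * half_period / c_f

for i_s_ua in (1, 3, 6, 20):             # tail currents from Table 5.2
    swing = peak_swing_v(i_s_ua * 1e-6, 1e-12, 20e6)
    print(f"Is = {i_s_ua:2d} uA -> swing = {swing * 1e3:.0f} mV")
```

Under these assumptions the ramp at 20 uA already amounts to about half a volt, which illustrates why $I_s$ cannot be raised, or C lowered, without regard to the headroom of the tail current source.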

# 5.5 Summary

Chapter 5 is another core part of this work. Both the analog and the digital method require a duty cycle correction circuit to adjust the duty cycle rate. The basic idea is to detect the time error between adjacent half periods by charging and discharging a capacitor. Adding the average voltage across this capacitor to the original reference input then forms a new reference input signal. Comparing the input clock with this new reference input signal creates an output clock with a more precise 50% duty cycle rate. The quality of the correction is limited by the channel charge injection of the switches and the tail current source of the detection circuit. Due to limited time, two other essential specifications, power consumption and phase noise floor, are not analyzed in detail in this project. This could be done in future work.

# **Chapter 6 Conclusion and Recommendations for Further Research**

In the five previous chapters, we discussed several critical definitions for the frequency doubler; the architecture level analysis; two methods of frequency doubling, analog and digital; a simple realization of the 90° delay cell achieved by a DLL; and the duty cycle correction circuit to adjust the spoiled duty cycle rate. Here we summarize the discussion and draw conclusions from the results obtained in the previous chapters. Recommendations for further research follow.

## 6.1 Summary

Chapter 1 starts with the motivation of this project: a higher reference frequency is needed for the frequency synthesizer to reduce the noise contribution. Therefore, there is a desire to double the reference frequency and at the same time preserve the clean properties of crystals. After setting the project goal, the required specifications are presented: a phase noise floor below -151dBc/Hz, a power dissipation below 4mW and a precise 50% duty cycle rate. Then the solution directions are analyzed: there are two ways to do the frequency doubling, analog and digital. This work focuses on the feasibility of a doubling circuit using a delay element and an XOR-gate, with a correction circuit to adjust the duty cycle rate so that it approaches 50%. The report outline is given at the end.

Chapter 2 introduces the definition of the edge (middle point) and the duty cycle error, which consists of two types of errors: the static error caused by the different transition times of NMOS and PMOS transistors, and the random error created by the random voltage noise at the middle point of the edge. Then the definition of jitter and phase noise is presented, together with the relation between random voltage noise, jitter and phase noise. To remove the static error, a duty cycle correction circuit is necessary. To reduce phase noise, a higher slew rate and lower voltage noise are needed. The second section of this chapter discusses the architecture level considerations of the doubler circuit to arrive at the complete architecture of the frequency doubler. At the end, a short introduction to the crystal oscillator is given.

Chapter 3 is the core chapter, in which two methods of frequency doubling are explained. In the analog method, increasing the differential input swing or using a larger tail current in the differential pair generates a higher slew rate and lower voltage noise, reducing the phase noise. An alternative digital method using an XOR-gate is then motivated. It turns out that the digital method achieves a phase noise floor below -168.31dBc/Hz at a cost of less than 2uW, compared with -155dBc/Hz at a cost of 0.6mW for the analog method. However, both the analog and the digital method require a precise 90° delay cell and a duty cycle correction circuit to adjust the spoiled duty cycle rate caused by the different transition times of NMOS and PMOS transistors.

Chapter 4 discusses a simple realization of a 90° delay cell. First it gives a short introduction to the delay-locked loop, including the classical architecture of the DLL and a commonly used phase detector and charge pump. Then it implements the DLL with two phase detectors and charge pumps, one for the rising edge and the other for the falling edge, with an inverter loaded with two current sources working as the individual delay element. Since the DLL is a big topic, it cannot be fully covered in this report. This chapter just proposes a simple way to make sure that both the rising edge and the falling edge have a precise 90° delay.

Chapter 5 is another core part of this work. Both the analog and the digital method require a duty cycle correction circuit to adjust the duty cycle rate. The basic idea is to detect the time error between adjacent half periods by charging and discharging a capacitor. Adding the average voltage across this capacitor to the original reference input signal then forms a new reference input. Comparing the input clock with this new reference input signal generates a clock with a much more precise 50% duty cycle rate. The quality of the correction is limited by the channel charge injection of the switches and the tail current source of the detection circuit. Due to limited time, two other essential specifications, power consumption and phase noise floor, are not analyzed in detail in this project. This could be done in future work.

## 6.2 Conclusion

(1) The middle point defines the edge. Only the voltage noise at the middle point of the edge contributes to phase noise; noise at other points of the edge is of no concern.

(2) The duty cycle error is caused by different transition times between NMOS transistors and PMOS transistors. A duty cycle correction circuit is necessary to remove this static error.

(3) A larger slew rate and lower voltage noise at the edge are two ways to reduce phase noise.

(4) The duty cycle correction circuit is needed at both the crystal oscillator output and the frequency doubler output.

(5) The digital method of frequency doubling achieves a phase noise floor below -168dBc/Hz at a cost of less than 2uW, compared with -155dBc/Hz at a cost of more than 0.6mW for the analog method. (The power consumption of the DLL and the DCC is not included. The power dissipation of the 90° delay cell is about 0.7mW. For the DCC, the power consumption depends on its phase noise requirement and is not yet determined.)

(7) A DLL consisting of two phase detectors and charge pumps can make sure that both the rising edge and the falling edge have a precise 90° delay.

(8) The channel charge injection is the main error source limiting how closely the duty cycle correction circuit can bring the duty cycle rate to 50%. Using smaller switches decreases the channel charge injection. However, it leads to a larger on-resistance of the switches and generates more thermal noise (phase noise). A trade-off between the two remains to be found.

## 6.3 Recommendations for Further Research

Due to limited time, not everything is covered in this project. Some work remains for further research.

(1) Dummy switch

Since the precision of duty cycle correction depends on the channel charge injection effect, dummy switches can be an option to reduce this effect.

(2) Phase noise analysis of DCC.

Since the DCC causes a lot of phase noise, it is necessary to derive the transfer function of the whole loop to check which components dominate the noise. These components can then be modified to reduce the phase noise.

(3) Power dissipation of DCC

The power dissipation of the DCC is related to its noise requirement. The two should be considered together.

(4) Implementation of the subtractor and the adder

An ideal VCVS is used as the subtractor in the duty cycle correction circuit. This ideal element needs to be implemented with a practical circuit. The ideal adder used in the DCC also needs to be implemented with a practical circuit.

(5) Implementation of overall doubler circuit.

After analyzing each separate block, it is necessary to combine these blocks to check the phase noise floor, power dissipation, and duty cycle rate of the overall circuit. The architecture is shown in figure 6.1.



Figure 6.1 Architecture of complete frequency doubler.

## Chapter 7 Comparison Between This Work and R. Oortgiesen's

Following the previous discussion, it is necessary to compare this work with that of R. Oortgiesen [2], since the topic is almost the same. Compared with his work, this project makes the following additional contributions.

(1) Definition of edge: Because a clock takes some transition time when it moves between a low voltage and a high voltage, a position on the edge must be chosen to define the edge, and only the random noise at this position creates phase noise. The noise at other points does not need to be considered. This definition is the basis of the whole project and is completely missing in Oortgiesen's work.

(2) Phase noise: this work presents methods to optimize phase noise by increasing the slew rate and lowering the voltage noise, which are confirmed by simulation for both the analog and the digital method. His work includes only a theoretical calculation, without simulation confirmation or optimization.

(3) Duty cycle correction circuit: the DCC is realized in this project, and the channel charge injection is found to be the main error source limiting the quality of the correction, which is supported by simulation results. This work also suggests an approach to reduce this channel charge injection. Oortgiesen only proposed a theoretical idea for a duty cycle correction circuit, without optimization or simulation.

(4) Power dissipation: again, optimization and simulation confirmation are included in this work and missing in his work.

(5) Area consideration: since area consumption is becoming more and more important in current IC design, it is covered in this work, whereas Oortgiesen overlooked this point.

(6) 90° delay cell: Since a crystal can offer only a single-ended clock output, a DLL is necessary to derive a second clock with a precise 90° delay in the digital method of frequency doubling. The analog method even needs a third clock with a minus 90° delay. This point is completely missing in Oortgiesen's work.

(7) The performance of the frequency doubler: using the analog method proposed by R. Oortgiesen, the phase noise floor is -155dBc/Hz @ 40MHz at a cost of 0.6mW. In this work, using the digital method, the phase noise floor is -168dBc/Hz @ 40MHz at a cost of less than 2uW, which is 13dB lower phase noise and a factor of 300 less power dissipation.

Table 7.1 shows a condensed comparison between this work and Oortgiesen's work.

|                                          | R. Oortgiesen                                                 | this work                                                             |
|------------------------------------------|---------------------------------------------------------------|-----------------------------------------------------------------------|
| Definition of edge                       | None                                                          | Well-defined                                                          |
| Phase noise                              | Theoretical calculation                                       | Theoretical analysis,<br>simulation investigation and<br>optimization |
| Duty cycle correction                    | Theoretical idea                                              | Realization of DCC and performance analysis and optimization          |
| Power dissipation                        | Theoretical calculation                                       | Simulation investigated and optimized                                 |
| Area consideration                       | None                                                          | Included                                                              |
| 90° delay cell                           | None                                                          | A novel way ( two PFDs and CPs )                                      |
| The performance of the frequency doubler | Using the analog method<br>phase noise: -155dBc/Hz<br>@ 0.6mW | Using the digital method<br>phase noise: -168dBc/Hz<br>@ 2uW          |

Table 7.1 Comparison between this work and Oortgiesen's work
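The headline figures in the last row of Table 7.1 can be compared directly. The short calculation below uses only the values from the table, taking 2uW as the upper bound quoted for the digital method; as noted in section 6.2, the power of the DLL and the DCC is not included in these numbers.

```python
import math

analog = {"phase_noise_dbc_hz": -155.0, "power_w": 0.6e-3}
digital = {"phase_noise_dbc_hz": -168.0, "power_w": 2e-6}

noise_gain_db = analog["phase_noise_dbc_hz"] - digital["phase_noise_dbc_hz"]
power_ratio = analog["power_w"] / digital["power_w"]
print(f"phase noise improvement = {noise_gain_db:.0f} dB")
print(f"power ratio = {power_ratio:.0f}x ({10 * math.log10(power_ratio):.1f} dB)")
```

The difference in phase noise floor is 13 dB and the ratio of the doubler power figures is 300.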

## Reference

[1] H. Huh, Y.Koo, K. . Lee, Y. Ok, S. Lee, D. Kwon, J. Lee, J. Park, K. Lee, D. . Jeong, and W. Kim. " Comparison Frequency Doubling and Charge Pump Matching Techniques for Dualband  $\Sigma\Delta$  Fractional-N Frequency Synthesizer". IEEE *Journal of Solid-State Circuits*, 40(11):2228-2235, 2005

[2] R. Oortgiesen "Feasibility Study of Frequency Doubling using a Dual-Edge Method". Master thesis University of Twente, 2010

[3] C.P. Lee, A. Behzad, B. Marholev, V. Magoon, I. Bhatti, D. Li, S. Bothra, A. Afsahi, D. Ojo, R. Roufoogaran, T. Li, Yuyu Chang, K.R. Rao, S. Au, P. Seetharam, K. Carter, J. Rael, M. Macintosh, B. Lee, M. Ro-fougaran, R. Rofougaran, A. Hadji-Abdolhamid, M. Nariman, S. Khor-ram, S. Anand, E. Chien, S. Wu, C. Barrett, Lijun Zhang, A. Zolfaghari, H. Darabi, A. Sarfaraz, B. Ibrahim, M. Gonikberg, M. Forbes, C. Fraser, L. Gutierrez, Y. Gonikberg, M. Ha\_zi, S. Mak, J. Castaneda, K. Kim, Zhenhua Liu, S. Bouras, K. Chien, V. Chandrasekhar, P. Chang, E. Li, and Zhimin Zhao. "A multistandard, multiband soc with integrated bt, fm, wlan radios and integrated power amplifier". pages 454 {455, feb. 2010. doi: 10.1109/ISSCC.2010.5433962.

[4] Ken Kundert. "Predicting the Phase Noise of PLL-based Frequency Synthesizers". Technical report, The Designer's Guide Community, 2006.

[5] Guillermo Gonzalez. "Foundations of Oscillator Circuit Design". Artech House, Inc., 2007.

[6] K. Martin. "Digital Integrated Circuit Design". New York: Oxford University Press, 2000.

[7] Chih-Kong Ken Yang. "Delay-locked Loops – Overview".

[8] Stefanos Sidiropoulos and Mark Horowitz. "A Semidigital Dual Delay-Locked Loop". IEEE Journal of Solid-State Circuits, Vol. 32, No. 11, November 1997.

[9] Behzad Razavi. "RF Microelectronics". Second Edition, Pearson, 2011.

[10] Thomas H. Lee, Kevin S. Donnelly, John T. C. Ho, Jared Zerbe, Mark G. Johnson, and Toru Ishikawa. "A 2.5 V CMOS Delay-locked Loop for an 18 Mbit, 500 Megabyte/s DRAM". IEEE Journal of Solid-State Circuits, Vol. 29, No. 12, December 1994.

[11] Y.C. Jang, S.J. Bae, and H.J. Park. "CMOS Digital Duty Cycle Correction Circuit for Multiphase Clock". Electronics Letters, Vol. 39, No. 19, 18 September 2003.

[12] James S. Humble, Patrick J. Zabinski, Barry K. Gilbert, and Erik S. Daniel (Mayo Clinic, Rochester, MN). "A Clock Duty-cycle Correction and Adjustment Circuit". ISSCC 2006, Session 28, Wireline Building Blocks, 28.7.

[13] Behzad Razavi. "Design of Analog CMOS Integrated Circuits". International Edition, McGraw Hill, 2001.

[14] G. Wegmann, E. A. Vittoz, and F. Rahali. "Charge Injection in Analog MOS Switches". IEEE Journal of Solid-State Circuits, Vol. SC-22, pp. 1091-1097, December 1987.

[15] B. J. Sheu and C. Hu. "Switch-Induced Error Voltage on a Switched Capacitor". IEEE Journal of Solid-State Circuits, Vol. SC-19, pp. 519-525, April 1984.

## Acknowledgements

First, I would like to thank Prof. Nauta for letting me do this project. It was you and this outstanding group that made me decide to come to the Netherlands and to this university. I have had a great time here and have grown a lot in the field of IC design. It is truly an honor for me to study and do research in this distinguished group.

Then I would like to thank my daily supervisors, Dr. Anne-Johan Annema and Erik Olieman, MSc. Anne-Johan often dropped by my workplace to discuss the project with me, which really helped me stay on the right track. When I finished the report, he gave me good feedback even while he was suffering from back pain. Erik was the one who helped me out whenever I was lost in the mess.

I also give my special thanks to Dr. Eric Klumperink, even though you were not involved in this project. Thank you for teaching me the courses Advanced Analog IC Electronics and Wireless Transceiver Electronics. Your critical thinking taught me the true essence of IC design, and your strict demands on report writing taught me how to write a report in a scientific way. I am sure that these two valuable lessons will benefit me for the rest of my life.

Thanks to Haifeng Ma, Wei Cheng, Zhiyu Ru and Harish for answering my day-to-day circuit design questions and for offering suggestions for my future plans.

Thanks to all my Chinese friends in the Netherlands. It is you who kept me from feeling lonely in this foreign country.

Finally, I would like to give all my thanks to my parents, to whom I have always been and will always be grateful for their major contribution to my well-being, wherever in the world I may be.