Amr - PHD Thesis UCLA 2012 High Frequency Multiphase Clock Generation Using Multipath Oscillators and Applications
Title
High Frequency Multiphase Clock Generation Using Multipath Oscillators and Applications
Permalink
https://escholarship.org/uc/item/75g8j8jt
Author
Abou-El-Sonoun, Amr Amin
Publication Date
2012
Peer reviewed|Thesis/dissertation
Los Angeles
By
2012
© Copyright by
2012
ABSTRACT OF THE DISSERTATION
by
Oscillators and frequency dividers are core building blocks in communications systems and
processors used to provide proper synchronization for the flow of information. Different
variations of the conventional ring oscillator that involve coupling of different oscillator-stages
or different oscillators have been introduced to provide multiple-phases without penalizing the
oscillation frequency. These variations, known as multipath ring oscillators, enable system
structures, however, introduce additional degrees of freedom and expand the design space
considerably which makes the process of designing them optimally a very difficult task.
This dissertation introduces an accurate analytical model and comprehensive analysis for
multipath ring oscillators and frequency dividers. The results of the analysis are incorporated into
an optimization algorithm that allows a designer to arrive at the desired optimal design in a very
short time. The analysis explains the factors that affect the different performance metrics
including the number of phases, oscillation frequency, phase noise, and oscillation-mode
stability.
multiphase sampling clock signals for the various stages of the serializer. The ability to generate
multiple clock phases at relatively high frequencies and low power cost allows significant power
The dissertation of Amr Amin Abou-El-Sonoun is approved.
Milos Ercegovac
Dejan Markovic
Sudhakar Pamarti
2012
To my parents
Table of Contents
1.2. Motivation
Acknowledgments
I would first like to thank my advisor, Professor Ken Yang, for the continuous help, support,
and guidance that he gave to me throughout this journey. I find it really hard to express my
gratitude for all the time he spent with me discussing my ideas, reviewing my papers, listening to
my design problems, and always being ready with advice and the right feedback.
I would also like to thank my dear friend and colleague, Ming-Shuan Chen, without whose
help and great effort I would never have been able to get my chip out in time and get it back
working.
I spent three very enjoyable months at Maxim-IC in Hillsboro, OR, working with a group of
very talented RFIC designers from whom I learnt so much. I would like to thank Scott Williams,
Joel Birkeland, Radu Fetche, Alexei Shatalov, Susan Tibbettes, and my dear friend Farhad
Farahbakhshian for this wonderful experience. I would also like to thank Prof. Robert Meyer of
UC Berkeley who I was lucky to work with and learn from during my internship at Maxim-IC.
I consider myself very lucky to have had the chance to spend the summer of 2010 working at
Broadcom with one of the best RFIC design teams in the world. It was a great experience
working with Hooman Darabi, Mohyee Mikhemar, Ahmad Mirzaei, and my friend, David
During my stay at UCLA, I was blessed to be among a group of wonderful people who made
my stay in Los Angeles a wonderful memory and who were always there for me, helping me get
over every technical and non-technical obstacle I have met. Thanks go to my dear friends, Ramy,
Sameh, Tamer, Ismail, Omar, Aboudina, Henry, Wonho, Preeti, Michael, Wessam, Karam, Said,
brother, and my little sister for the unlimited love and support they gave to me throughout my
entire life without which I would have never been even close to where I am today.
VITA
2009 RFIC Design Intern, Maxim Integrated Products, Hillsboro, Oregon, USA
PUBLICATIONS
Amr A. Hafez, Ming-Shuan Chen, and C.-K. Ken Yang, “A Multi-Phase Multi-frequency
Clock Generator Using Superharmonic Injection Locked Multipath Ring Oscillators as
Frequency Dividers,” Accepted in IEEE Asian Solid State Circuits Conference (ASSCC), Nov.
2012.
Amr A. Hafez, C.-K. Ken Yang, “Analysis and Design of Superharmonic Injection-Locked
Multipath Ring Oscillators,” Submitted to IEEE Trans. Circuits and Systems—I: Regular
Papers.
Amr A. Hafez, C.-K. Ken Yang, “Design and Optimization of Multipath Ring Oscillators,”
IEEE Trans. Circuits and Systems—I: Regular Papers, vol. 58, no. 10, Oct. 2011, pp. 2332-
2345.
David Murphy, Amr A. Hafez, Ahmad Mirzaei, Mohyee Mikhemar, Hooman Darabi, M.-C.
Frank Chang, Asad Abidi, “A Blocker-Tolerant Wideband Noise-Cancelling Receiver with a
2dB Noise Figure,” ISSCC, Feb. 2012, pp. 74-75.
Tamer A. Ali, Amr A. Hafez, Robert Drost, Ronald Ho, C.-K. Ken Yang, “A 4.6GHz MDLL
with -46dBc Reference Spur and Aperture Position Tuning,” ISSCC, Feb. 2011, pp. 466-467.
CHAPTER 1
Introduction
Oscillators and frequency dividers are core building blocks in communication systems and
A multiphase clock generator is a core building block in many applications that allows a
Numerous examples of applications that use multiphase clock generators exist. Some are
discussed in Section 1.1. In such applications, the benefit of parallelism is largely based on the
assumption that multiphase generation can be achieved efficiently and with low overhead.
frequency divided outputs of an LC oscillator with high quality factor and low phase noise.
Complex structures known as multipath oscillators that involve coupling of multiple oscillators
and oscillator stages have been introduced in order to improve oscillator performance. These
structures, however, introduce additional degrees of freedom and expand the design space
considerably.
This dissertation introduces an analytical model and comprehensive analysis for multipath
ring oscillators and frequency dividers to aid designers in arriving at a near optimal design.
Section 1.2 of this chapter motivates this analysis by describing the design complexity of these
oscillators and dividers. The organization of the dissertation is then presented in Section
1.3.
1.1. Examples of Multiphase Clocking Applications
Any application that involves the interleaving of multiple paths requires multiple clock
phases. Common examples include interleaved data-converters and sub-rate clock and data
converter is used to achieve a sampling rate 4 times higher than that of a single unit converter.
These radios are expected to operate in different frequency-bands making it very expensive to
achieve band selection using off-chip filters. To prevent out-of-band signals
from interfering with the desired signal, a receiver has to use multiple signal paths driven by
In an application such as high-speed serial link transceivers, not only multiple phases but also
multiple divided clock frequencies are usually needed to perform data serialization and
deserialization. These clock signals are typically generated using a chain of frequency-dividers
that is driven by the output of a high-frequency oscillator [5-9]. For serial links operating at tens
of Gb/s, these dividers consume a considerable amount of power to meet the target operation
frequencies over all process corners. In many cases, bandwidth extension techniques using
expensive passive inductors must also be used to meet the target speeds. This dissertation uses a
48 Gb/s serial link transmitter as a design example to illustrate the area and power benefits of
using injection-locked multipath oscillators to frequency divide and generate multiphase clocks.
1.2. Motivation
To generate a multiphase clock, a ring oscillator can be used in a phase-locked loop as shown
in Fig. 1.2(a). A conventional inverter-based ring oscillator, as shown in Fig. 1.3, consists of a
single loop of an odd number of inverters. While compact, easy to design, and tunable over a
wide frequency range, this oscillator suffers from several limitations. For example, it is not
possible to increase the number of phases while maintaining the same oscillation frequency since
the frequency is inversely proportional to the number of inverters in the loop. In other words, the
time resolution of the oscillator is limited to one inverter delay and cannot be improved below
this limit. In addition, the number of phases that can be obtained from this oscillator is limited to
odd values. Otherwise, if an even number of inverters is used, the circuit remains in a latched
To overcome the limitations of conventional ring oscillators, several variations have been
proposed. These variations all share the property that instead of having each phase in the
oscillator driven by a single inverter, or a single path, each phase can be driven by two or more
inverters, or multi-paths. Each of these paths is driven by a different phase in the oscillator [10-
19]. These variations are collectively known as multipath ring oscillators or MPROs as opposed
Fig. 1.2. Multiphase generation: (a) Using a multiphase ring oscillator. (b) Using an LC oscillator and a
multiphase frequency-divider.
In MPROs, the selection of which phase to use to drive each path and the relative sizing of
these different paths add additional degrees of freedom to the design of the oscillator. These
degrees of freedom, if properly used, can allow a designer to overcome the limitations of
conventional SPROs. However, these degrees of freedom also expand the design space of the
Previous MPRO analyses have been either limited to a qualitative argument or otherwise
restricted to certain special cases. Since the number of possible coupling structures is very large
and it keeps increasing as the number of output phases increases, the existing analysis does not
allow a designer to explore the whole design space and find the optimum design. Accordingly, a
new analytical framework that explains the operation of these oscillators is needed if the full
One thing that makes the MPRO design problem even more complicated is its property of
having multiple possible oscillation modes. Without a clear understanding of what makes one of
these modes dominant, a designer may well end up with an oscillator that starts up in a different
oscillation mode each time, depending on the initial state of the oscillator.
This is of course unacceptable in any practical application. This problem, which is characteristic
of multipath oscillators, is one that has been almost totally overlooked in all previous
publications.
for multipath ring oscillators. We explain the effect of the oscillator coupling structure on the
different performance metrics including the number of phases, oscillation frequency, phase
noise, and oscillation-mode stability. We use the presented model to formulate an optimization
algorithm that allows a designer to find the optimum coupling structure of an MPRO, needed to
While ring oscillators are very compact and can be tuned over a wide frequency range, they
are typically suitable for applications that require relatively less stringent phase noise
applications like RF and high-speed serial-link transceivers, better phase-noise performance than
can be provided by ring oscillators is needed. Accordingly, LC oscillators are typically used. To
generate multiple phases, it is possible in principle to use a multiphase LC oscillator. However,
due to the high cost of high-Q on-chip passive inductors in terms of chip area, it is very rare to
generate more than 4 phases using a multiphase LC oscillator. Alternatively, a multiphase clock
with high spectral purity can be generated from an LC oscillator using frequency-dividers as
shown in Fig. 1.2(b). This method has been shown to have less jitter as compared to other
approaches such as delay-locked loops [4] for the same power budget.
Frequency division at very high frequencies can be achieved efficiently using injection-locking
[20-29]. While injection-locked LC oscillators can operate at higher frequencies than
ring oscillators, they suffer from a narrow locking-range due to their high quality factor. This makes
injection-locked LC oscillators are not suitable for multiphase generation due to the need for
Injection-locked ring oscillators, on the other hand, have wider locking-ranges but they
oscillator has a maximum locking frequency that decreases with the number of phases at its
output. This means that the same trade-off between phase resolution, operation frequency, and
power consumption still exists in frequency dividers realized as injection-locked ring oscillators.
superharmonic injection locked ring oscillators. Accordingly, the performance of these dividers
can be extended in the same way the performance of a ring oscillator can. We propose, for the
first time, the use of superharmonic injection-locked multipath ring oscillators (SHIL-MPROs) to
achieve frequency division at higher frequencies for the same number of output phases. The
theory we propose explains the different factors that limit the locking range of a SHIL-MPRO
and how it can be maximized for a certain number of output phases by choosing the optimum
coupling structure of the core MPRO and the input clocking configuration.
The dissertation consists of seven chapters. Chapter 2 gives the necessary background on
oscillators, MPROs, frequency dividers, and serializing transmitters. It begins with a brief
discussion of the important performance metrics of an oscillator, followed by a review of the
different forms of MPROs that have been proposed in the literature. The chapter then
reviews the different categories of multiphase frequency dividers. Finally, it gives a quick
overview of high-speed serial link transmitters and the factors that limit their operation range.
In Chapter 3, we present our analysis for MPROs. We derive accurate expressions for the
oscillation frequency and phase noise. We show that for the same power consumption, if the
oscillator is designed correctly, the number of phases can be increased without decreasing the
oscillation frequency or degrading the phase noise. We also explain how the oscillator should be
designed to oscillate in the desired oscillation mode independent of the initial conditions.
The analysis for MPROs is extended to dividers in Chapter 4. Theoretically, the maximum
division frequency of a frequency divider can be decoupled from the number of phases at its
output when multipath coupling is used. We derive accurate expressions for the locking range
and use them to find the optimum structure to maximize the division frequency. The results of
the model are incorporated into a design procedure that can rapidly explore various coupling and
We explain the details of the serializer design in Chapter 5 and discuss the measurement
results in Chapter 6. Finally, conclusions and future work are discussed in Chapter 7.
CHAPTER 2
Background
This chapter begins with a quick review of some basics of oscillators followed by the
evolution of multipath ring oscillators. We then give a brief summary of the different methods to
design a multiphase frequency divider. The definitions in this chapter are used in the next two
chapters where the analysis is explained in detail. Since a high-speed serial-link transmitter is
used as an application to demonstrate the design of SHIL-MPROs, the last section in this chapter
introduces the components and design challenges of serial link transmitters. We explain the
timing constraints that limit its operating range and show how they can be relaxed using high-
speed multiphase frequency dividers. We show that if these dividers can be designed at a low
power cost, significant savings in power and complexity for the overall transmitter can be
achieved.
asymptotically stable limit cycle in its state space [30]. A more circuit-oriented definition is a
circuit that generates one or more periodic waveforms at its output. These waveforms can be
characterized by a certain frequency and amplitude and each of them has a certain phase.
If the harmonic content of the oscillator output can be neglected compared to its fundamental
where Ao is the oscillation amplitude, ωo is the angular frequency, and Φo is a fixed phase angle.
In practice, due to the various sources of noise and interference in a real circuit, both the
amplitude and the phase of the oscillator output suffer from small fluctuations around their
nominal average values. Accordingly, the oscillator output can be more practically expressed as
where a(t) and φn(t) represent the amplitude and phase fluctuations, respectively.
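As a sketch, the ideal and the noisy oscillator outputs described above are commonly written as follows. The symbol V_out, the use of a cosine, and the additive form of the amplitude fluctuation are assumptions made here for illustration; only the symbols Ao, ωo, Φo, a(t), and φn(t) come from the text.

```latex
% Ideal output: harmonics neglected, fixed amplitude and phase
V_{out}(t) = A_o \cos\left(\omega_o t + \Phi_o\right)

% Practical output: small amplitude and phase fluctuations added
V_{out}(t) = \bigl(A_o + a(t)\bigr)\cos\bigl(\omega_o t + \Phi_o + \varphi_n(t)\bigr)
```

In this form, a(t) perturbs only the envelope, while φn(t) shifts the zero-crossing instants.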
While small amplitude fluctuations can be tolerated in most practical applications, phase
fluctuations, commonly known as phase noise, are usually problematic. Fig. 2.1 shows the effect
of phase noise on the output of the oscillator in both the frequency and time domains. In the
frequency domain, random phase noise causes the output spectrum to broaden in shape compared
transceivers, this broadening is highly undesirable since it causes signal interference. In the time
domain, phase noise causes the zero-crossing instants of the oscillator output to deviate from
their ideal positions at integer multiples of the nominal oscillation-period. These deviations,
known as timing jitter, are highly undesirable in applications like high-speed digital circuits and
serial link transceivers since they reduce the timing margin where the system operates correctly.
Fig. 2.1. Spectral broadening and timing jitter due to phase noise.
Phase noise of an oscillator is measured in terms of the single-sideband phase noise spectrum
L(Δω). This is defined as the signal power in a 1-Hz bandwidth at a certain frequency offset, Δω,
from the nominal carrier frequency, ωo, relative to the carrier power. The different mechanisms by which device noise is
converted into phase noise have been well studied by different researchers before [31-37]. In
general, for hundreds of Hz offset frequencies and above, it is fairly accurate to assume that
thermal noise causes phase noise L(Δω) that is proportional to (1/Δω)² while flicker noise
produces phase noise skirts proportional to (1/Δω)³. Fig. 2.2 shows a typical single-sideband
Phase noise caused by random device noise can be reduced in a certain oscillator by scaling
the whole design so that it burns more power. Accordingly, if two different oscillator topologies
are to be compared in terms of their phase noise performance, the two oscillators must be
compared when scaled to burn the same power. In addition, oscillators running at higher
frequency are more sensitive to device noise due to their shorter oscillation periods. Accordingly,
to compare two different topologies, the oscillation frequency must also be the same for the two
oscillators.
To help compare the performance of different kinds of oscillators, a figure of merit (FOM) is
typically used that factors out the effect of power consumption, oscillation frequency, and offset
frequency where the phase noise is measured. This FOM, given by (2.2), will be used in the next
chapter to compare the performance of different ring oscillators in terms of phase noise.
\[
\mathrm{FOM} = 10\log\left[\frac{\left(\omega_o/\Delta\omega\right)^2}{L(\Delta\omega)\cdot P|_{\mathrm{mW}}}\right] \tag{2.2}
\]
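To make the normalization in (2.2) concrete, the following is a minimal Python sketch. It assumes the FOM has the common form 10·log10[(ωo/Δω)² / (L(Δω)·P)], with P in mW and L(Δω) supplied in dBc/Hz. The function name and the example numbers are hypothetical and are not taken from the dissertation.

```python
import math


def oscillator_fom_db(f_osc_hz, f_offset_hz, phase_noise_dbc_hz, power_mw):
    """Phase-noise figure of merit in dB, assuming the form of (2.2).

    FOM = 10*log10[ (f_o / df)^2 / (L(df) * P[mW]) ]
    """
    # f_o/df equals w_o/dw because the 2*pi factors cancel in the ratio.
    carrier_term_db = 20.0 * math.log10(f_osc_hz / f_offset_hz)
    # Power is normalized to 1 mW before taking the logarithm.
    power_term_db = 10.0 * math.log10(power_mw)
    # L(df) is already in dB, so dividing by it is a subtraction here.
    return carrier_term_db - phase_noise_dbc_hz - power_term_db


# Hypothetical example: 5 GHz oscillator, -100 dBc/Hz at 1 MHz offset, 10 mW.
fom = oscillator_fom_db(5e9, 1e6, -100.0, 10.0)
print(f"FOM = {fom:.1f} dB")
```

With this sign convention a larger FOM is better, and doubling the power at fixed phase noise lowers the FOM by about 3 dB, which is why topologies are compared at equal power and frequency.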
Fig. 2.3 shows the schematic of a typical 5-stage inverter-based ring oscillator. For this
oscillator to operate correctly, the number of inverters in the loop and accordingly the number of
phases at the output has to be an odd number. The oscillation frequency of this oscillator is
inversely proportional to the number of stages. Accordingly, the time resolution of the oscillator
multipath ring oscillators have been introduced [10-19]. The simplest of these forms is the dual-
path ring oscillator from which three different forms have been proposed.
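As a rough illustration of the trade-off between frequency and number of phases in a single-path ring, the sketch below assumes the usual first-order relation f = 1/(2·N·t_d) for an N-stage inverter ring with stage delay t_d. The function name and the 20 ps stage delay are hypothetical values chosen only for illustration.

```python
def spro_frequency_hz(num_stages, stage_delay_s):
    """First-order frequency of a single-path inverter ring: f = 1 / (2 * N * t_d)."""
    # A single loop of inverters only oscillates with an odd stage count.
    if num_stages < 3 or num_stages % 2 == 0:
        raise ValueError("a single inverter loop needs an odd number of stages, at least 3")
    # One period is two passes around the ring (rising and falling edge).
    return 1.0 / (2.0 * num_stages * stage_delay_s)


STAGE_DELAY_S = 20e-12  # hypothetical inverter delay, not a value from the text

for n in (3, 5, 7, 9):
    f_ghz = spro_frequency_hz(n, STAGE_DELAY_S) / 1e9
    print(f"{n} stages: {f_ghz:.2f} GHz, delay per stage fixed at {STAGE_DELAY_S * 1e12:.0f} ps")
```

Adding stages adds phases but lowers the frequency, while the delay per stage, and hence the time resolution, does not improve.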
The first of these forms is the coupled oscillator proposed by Maneatis in 1993 [10]. In this
oscillator, the number of phases can be increased, according to Maneatis, without changing the
oscillation frequency. This is achieved by coupling a number of rings using a second path as
shown in Fig. 2.4 which causes the phases of the different rings to shift with respect to each
proposed by Lee in 1997 [11]. In this oscillator, the time resolution is improved by reducing the
delay per stage below 1 inverter-delay. This is achieved, as shown in Fig. 2.5, by driving each
output with an early signal, or a signal with a negative-delay, to accelerate the transition at the
output. The optimum skew has been derived analytically in [12] using a linear model and
verified by simulation.
Fig. 2.6 shows the third form of dual-path ring oscillators which uses cross-coupling to
achieve differential operation [13]. While the three proposed forms seem qualitatively different,
all three can be analyzed using one unified theory as will be shown in the next chapter.
Fig. 2.6. An 8-phase ring oscillator using cross-coupling to achieve differential operation.
The triple-path ring oscillator can be considered the second step in the evolution of multipath
ring oscillators. Fig. 2.7 shows an example where both delay-skew and cross-coupling are used
to reduce the delay per stage and achieve differential operation, simultaneously [14-16].
Fig. 2.7. An 8-phase triple-path ring oscillator using cross-coupling to achieve differential operation and
negative delay skew to reduce the delay per stage.
In general, we can add more and more paths per stage. As the total number of phases
increases, the degrees of freedom increase very rapidly so that it becomes very hard for a
designer to find the optimum configuration that meets his design targets just by qualitative
analysis. Moreover, it becomes very difficult to verify this configuration just by simulation.
As an extreme case, the oscillator used by Straayer in 2008 to provide a time base for a time-
to-digital converter is a 47-stage, penta-path ring oscillator [17]. Of course, it can be imagined
how hard it is to come up with the best coupling configuration for such an oscillator using only
number of stages as in SPROs but rather on the coupling structure of the oscillator as well. In
fact, if a designer tries to explore the design space by simulation, he will see that for some
coupling structures, as the sizing ratios are varied, the oscillation frequency can experience
discrete jumps accompanied by a change in the phase arrangement of the oscillator outputs. In
The reason for this phenomenon is that multipath oscillators have different possible modes of
oscillation. Each of these modes is characterized by a different frequency, phase shifts, and phase
noise versus power tradeoff. Without a clear understanding of what determines the dominant
mode, a designer may well end up with an oscillator where the oscillation
mode is only marginally stable, meaning that it depends on the oscillator initial conditions [17].
The problem becomes even more severe as the number of phases in the oscillator increases.
Factors that determine the dominant oscillation mode have been studied in [18][19]. However,
the analysis in [18] is constrained to dual-path oscillators while that in [19] is constrained to even
In the next chapter, we introduce a general model for multipath ring oscillators with arbitrary
coupling structures. We present a general analysis that explains the operation of a generic
multipath ring oscillator. We show that all the previously presented designs are special cases of
the general case we study. We explain how to design the oscillator for a certain number of phases
and maximize its oscillation frequency and phase noise performance while guaranteeing the
2.3. Multiphase Frequency Dividers
Frequency dividers are important building blocks in clock generators performing frequency
multiplication and in low-jitter multiphase generators. The most widely used type of frequency
dividers is the latch-based divider. These dividers are composed of latches or flip-flops as unit
building blocks. The latches can be of any logic family like CMOS, true single-phase clock
(TSPC), or current-mode logic (CML). To optimize the performance of these dividers, the only
degree of freedom is optimizing the latch design. In [38] for example, all possible
implementations of TSPC latches and flip-flops are explored using an exhaustive search to find
the optimum flip-flop implementation. In [39], a modified CML latch circuit is proposed that can
operate at speeds up to tens of GHz. This latch is used in [40] to design a 40 GHz divide-by-2
circuit. Inductive peaking can be viewed as a way of optimizing the latch design by partially
Regenerative frequency dividers (RFDs), also known as Miller-type dividers [20], and
injection-locked frequency dividers (ILFDs) have been used for frequency division at very high
frequencies. Although both terms are sometimes used interchangeably [21], the most common
convention is that regenerative dividers do not have a self-oscillation frequency in the absence of
the input signal. In other words, the self-oscillation frequency of regenerative dividers is equal to
zero. On the other hand, injection-locked dividers oscillate at a certain self-oscillation frequency
if the input signal is not applied. In [22], the authors show that both types of dividers can be
Both RFDs and ILFDs can be classified based on their core oscillator, injection method, and
injection phases. The core oscillator can be either an LC-based oscillator [21][23] or a ring-
oscillator [24][27]. LC-based oscillators can operate at relatively higher frequencies but they
suffer from a narrow locking range which makes them sensitive to PVT variations. In addition,
they consume large areas due to the use of inductors. Ring oscillators, on the other hand, have
lower frequencies but are more compact and have wider locking ranges reducing their sensitivity
to PVT variations.
Signal injection can be achieved by two different methods. One method is by varying the
transconductance of the amplifying transistors with the input signal. This is usually achieved by
driving the tail bias-current of these transistors with the input and thus is referred to as tail
injection [21][27]. Another method is by varying the load impedance, usually achieved by
modulating the conductance of a load that is directly connected to the divider output and thus is
called direct injection [23][24]. Fig. 2.8 shows an example of each of the 4 types of dividers
mentioned.
Dividers can also be classified based on the number of injection phases. Single-phase
injection means that only one phase is required at the input [25]. Similarly, the divider is said to
have differential injection when two opposite phases are used at the input [23][24]. In general,
multiphase injection can be used when M phases are used at the input (M=1, 2, 3, …) [26-28].
The idea of multiphase injection for frequency dividers has been introduced in [26]. The
authors use a frequency domain model to derive the optimum phase shift between different
injected signals to maximize the locking range. The analysis is applied to a 6-phase ring
oscillator composed of 3 CML differential buffers and with tail injection at two stages with 120°
phase shift. In [27], the authors explain the concept of multiphase injection using a time-domain
model. They apply it to a 4-phase divide-by-2 and a 12-phase divide-by-6 both composed of
Fig. 2.8. Different types of injection-locked frequency dividers classified based on the core oscillator and
the injection method.
The term, superharmonic injection-locked oscillator (SHIL-O), has been used in [29] to
can be locked in phase at steady state to a periodic input signal whose frequency is a
superharmonic, or an integer multiple, of its own output frequency. In that sense, both RFDs and
ILFDs can be considered SHIL-Os. The only difference between them is that RFDs use a core
We propose in Chapter 4 of this dissertation a general model for superharmonic injection-
locked multipath ring oscillators (SHIL-MPROs). The model is valid for both RFDs and ILFDs
with either tail-injection or direct-injection and with any number of input injection phases. In our
analysis, latch-based dividers are treated as a special case where the injection is large. We show
that SHIL-MPROs have different mechanisms that limit the locking range and derive the
mathematical expressions needed to optimize the locking range. We also derive a mathematical
condition that gives the possible clocking configurations for correct injection phases. With
multiple injection phases and multipath sizing factors, the design space of SHIL-MPROs can be
very large. A design procedure that uses the model is presented to allow designers to quickly
verification of the analysis. Fig. 2.9 shows a typical block diagram of a serializing transmitter [5-
9]. The transmitter consists mainly of a serializer, a driver, and a clock generator. The serializer
is a chain of multiplexers that combine a number of low-speed input data channels and produce a
single high-speed output data stream to be transmitted over a certain communication channel.
The driver converts the binary data into a signal format suitable for the channel. For the serializer
to operate properly, clock signals at different frequencies are required to drive the different
multiplexing stages. These clock signals are provided by a voltage-controlled oscillator and a
Fig. 2.9. Conceptual block-diagram of a serializing transmitter.
The range of output data rates where the transmitter can operate properly is limited by several
factors. These factors include the circuit bandwidths, the oscillator tuning range, the locking
range of the frequency dividers, and the timing constraints of the different serializer stages.
For multi-Gb/s transmitters, LC oscillators with a combination of coarse and fine tuning
capacitors are typically used to generate the required high-speed clock with low phase noise and
a wide enough tuning range. To meet the speed requirements, designers usually use inductive
peaking to extend the circuit bandwidths at the cost of the large area consumed by the inductors
[42-44]. This bandwidth extension is usually applied to the final multiplexing stages in the data
To explain how the timing constraints of a certain multiplexing stage can limit the operation
range of the transmitter, Fig. 2.10(a) shows an N-to-1 multiplexer driven by an N-phase input
clock signal of frequency fo/N where fo is the bit rate at the multiplexer output. This clock signal
also samples, and if necessary, retimes, the N outputs of the preceding stage. Fig. 2.10(b) is a
timing diagram that describes the operation. The signal D<1> is an output of the preceding
multiplexer. Transitions in D<1> occur after a delay t(C-Q)MUX from the edge of the divided clock
signal CKDIV<1>. This divided clock signal in turn has transitions that are delayed by tdiv from
the edge of the input clock signal CKin<1>. The signal D<1> is sampled by CKin<i> which is
phase-shifted from CKin<1> by (2π)θ. In Fig. 2.10(a), θ can only take discrete values that are
multiples of 1/N. In general, if a buffering stage or variable delay is used between the input clock
and the N-to-1 MUX, θ can be any positive real number. Depending on the number of phases at
the output of the divider and the number of latches used in the retiming flip-flop, the same data
phase can be sampled by more than one input clock phase. We assume the phase difference
between the first and last clock edges sampling the same data phase is equal to (2π)φ.
Fig. 2.10. Timing requirements in an N-to-1 multiplexer: (a) Block diagram. (b) Timing diagram.
For the setup time condition to be met, the delay between the triggering edge CKin<1> and the
first sampling edge CKin<i> has to be long enough to allow the data to switch before the edge of
CKin<i> by at least one setup-time (tsu). This delay sets an upper limit on the output bit rate as
given by (2.3). On the other hand, for the hold time constraint to be met, the delay between the
triggering edge CKin<1> and the last sampling edge CKin<j> has to be short enough to allow at
least one hold-time (th) before the next data transition. Assuming th < t(C−Q)MUX + tdiv, the hold time
condition sets a lower limit on the allowed output bit rate as given by (2.4).
$$f_{\max} = \frac{N\theta}{t_{div} + t_{(C-Q)MUX} + t_{su}} \tag{2.3}$$
$$f_{\min} = \frac{N(\theta + \varphi - 1)}{t_{div} + t_{(C-Q)MUX} - t_{h}} \tag{2.4}$$
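The two limits can be evaluated numerically to see how the window moves with θ. The following is a minimal sketch, assuming the limits take the form f_max = Nθ/(tdiv + t(C-Q)MUX + tsu) and f_min = N(θ + φ − 1)/(tdiv + t(C-Q)MUX − th); the function name `bit_rate_window` and all delay values are hypothetical and chosen only for illustration.

```python
def bit_rate_window(n_mux, theta, phi, t_div, t_cq, t_su, t_h):
    """Return (f_min, f_max) in bit/s from the setup and hold conditions.

    Assumes t_h < t_div + t_cq, as stated in the text.
    """
    # Setup: the first sampling edge arrives theta clock periods after the
    # triggering edge and must leave t_su after the data has switched.
    f_max = n_mux * theta / (t_div + t_cq + t_su)
    # Hold: the last sampling edge (theta + phi periods) must precede the next
    # data transition by t_h. No lower limit exists if theta + phi <= 1.
    f_min = max(0.0, n_mux * (theta + phi - 1.0) / (t_div + t_cq - t_h))
    return f_min, f_max


# Hypothetical delays in seconds, for illustration only.
T_DIV, T_CQ, T_SU, T_H = 40e-12, 30e-12, 10e-12, 10e-12

for theta in (1.0, 1.5, 2.0):
    f_min, f_max = bit_rate_window(4, theta, 0.0, T_DIV, T_CQ, T_SU, T_H)
    print(f"theta={theta}: {f_min / 1e9:.1f} to {f_max / 1e9:.1f} Gb/s")
```

With these assumed numbers, raising θ raises both limits, and the ratio f_min/f_max moves toward unity, which is the narrowing of the window discussed in the text.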
Fig. 2.11 shows the effect of varying the phase shift θ and the order of the multiplexer on the
valid bit rate window. In the figure, we assume φ=0 which can be achieved by retiming the
sampled data before applying it to the multiplexer. As θ increases for a certain multiplexer, the
maximum allowed bit rate increases but the overall window of allowed rates becomes narrower.
The graph shows that a tradeoff exists between maximizing the output bit rate of the transmitter
and its robustness where the narrow window can shift with PVT variations. When the order of
the multiplexer (N) increases, this tradeoff between data rate and robustness becomes more
Fig. 2.11. Allowed bit rate window as a function of θ assuming φ = 0.
One technique for dealing with this delay skew problem is to insert a replica delay buffer
between the input clock and the N:1 MUX [45]. Ideally, if the buffer delay matches the total
delay of the divider and the preceding MUX, the timing conditions can be met independent of
the bit rate. However, the replica buffer consumes extra power and occupies additional area.
Besides, it is usually hard to achieve good matching over all PVT corners [8].
Another technique for dealing with the delay skew problem is the insertion of a variable delay
or a phase interpolator between the divided clock and the preceding MUX [7-8]. A feedback loop
adjusts the value of θ so as to position the valid data rate window around the desired operation
data rate. The drawback of this technique is the added design complexity in addition to the power
consumption and area overhead. For example, the power overhead in [8] is 72mW which is
slightly less than the total power consumption of our whole transmitter in the same 65nm CMOS
technology. The area overhead is 0.88mm2 which is 73% of our total chip area.
Equations (2.3-2.4) and Fig. 2.11 show that as the order of the multiplexer (N) increases, the
timing constraints become more relaxed due to the increased bit period at the input. In addition,
as more phases are used for the input clock, the phase shift θ can be selected at design time with
a higher resolution allowing the optimum positioning of the nominal bit rate window such that
the desired bit rates can fall within this window for all PVT corners.
Since the timing conditions have to be satisfied in every stage in the serializer, it is
advantageous if the divided clocks also have multiple phases, because this allows more
flexibility in setting the maximum and minimum allowed data rates for all stages. In Chapter 5,
we show how multiple phases in the divided clock signals can also be useful in reducing the
retiming elements to a minimum which leads to significant power and area savings.
CHAPTER 3
This chapter presents the analysis of multipath ring oscillators. We first explain a general
model that can represent any multipath ring oscillator. Then, we derive expressions for the
oscillation frequency and amplitude and show how the frequency can be maximized taking mode
stability into consideration. After that, we present the phase noise analysis.
Fig. 3.1 illustrates the general form of an MPRO. The oscillator consists of N stages where
each stage consists of N inverting buffers connected together at their outputs. The input of each
buffer is connected to one of the N different nodes of the oscillator to cover all possible coupling
paths. The buffers of the main loop are all scaled by a sizing factor hN with respect to a unit
reference buffer of a given size. Buffers that are driven by signals skewed by one stage are sized
by a factor hN-1. Similarly, buffers driven by signals skewed by j stages are sized by hN-j.
sizing factors, hi. For example, an ordinary SPRO is a special case of the general form where
hi=0 ∀ i≠N. A dual-path ring oscillator has hi=0 ∀ i∉{N,N−j} where j is the skew of the second
input and 0<j<N. Cross coupling, usually used to achieve differential operation, can be modeled
for an even numbered N by h(N/2)+1≠0. The coupled oscillator [10] can also be treated as a special
case. By using a 5×3 coupled oscillator as shown in Fig. 3.2(a) as an example, when numbering
the nodes diagonally as shown in the figure, the oscillator can be visualized as a 15-stage dual-
Fig. 3.2. An example using a 5×3 coupled-oscillator: (a) Coupled oscillator with its nodes numbered
diagonally. The coupling inverters are scaled by h15 while the ring inverters are scaled by h13. (b) Equivalent
dual-path ring oscillator with a skew of 2.
Fig. 3.3 shows the circuit schematic of the unit inverter used in all the simulations in this
chapter. All simulations are performed using a 65nm CMOS technology with a supply voltage of
1V. Transistors M1 and M4 are used as enable devices where their gates are always driven by
VDD for M1 and GND for M4. Transistors M2 and M3 are the main inverter devices. They have
dimensions of (4µm/L) for M2 and (8µm/L) for M3 as shown in the figure. In different
simulations, L takes values of 60nm, 100nm, 200nm, or 400nm. The driving strength of each
inverter is determined by the number of units used which is indicated by hi in Fig. 3.3. Unless
otherwise stated, all the simulation results shown in this chapter are obtained using the periodic
steady state (PSS) and the periodic noise (PNoise) simulators in Cadence-SpectreRF.
Fig. 3.4(a) shows the inverting buffer modeled as a linear transconductor. The input-output
relationship of a single buffer scaled by hi and driving the input capacitance of a similar buffer
can be expressed as
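$$\begin{aligned}
h_i g_m v_{in}(t) + h_i g_o v_{out}(t) + h_i C_g \frac{dv_{out}(t)}{dt} &= 0 \\
a_n v_{in}(t) + v_{out}(t) + \tau \frac{dv_{out}(t)}{dt} &= 0
\end{aligned} \tag{3.1}$$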
where gm is the transconductance, go is the output conductance, Cg is the buffer input capacitance
which also acts as the load capacitance for the driving buffer, an=gm/go is the linear dc gain of the
buffer and τ=Cg/go is a time constant. In practice, parasitic capacitance due to routing and any
Fig. 3.4. Model of a single unit inverting buffer (a) Linear model (b) Nonlinear model.
A single stage of an MPRO consists of N buffers and is loaded with N buffers of similar
sizing factors as shown in Fig. 3.1. Using the linear model, the output voltage of any stage can be
expressed in terms of the N input voltages using a linear differential equation. Equation (3.2)
expresses v1, the output of the first stage in Fig. 3.1, in terms of the N inputs of the first stage.
$$\sum_{i=1}^{N} h_i g_m v_i(t) + \sum_{i=1}^{N} h_i g_o v_1(t) + \sum_{i=1}^{N} h_i C_g \frac{dv_1(t)}{dt} = 0 \tag{3.2}$$
Defining the fractional sizing factors and the total sizing factor as $x_i = h_i/H$ and $H = \sum_{i=1}^{N} h_i$,
respectively, equation (3.2) can be simplified into (3.3) where an=gm/go and τ=Cg/go are the same
To model the nonlinear behavior of the inverter, we assume the transconductance and the
output conductance to be nonlinear functions of the input and the output voltages, respectively.
This nonlinear model is shown in Fig. 3.4(b). In this model, we assume the inverter input
capacitance is linear. The output voltage of a stage can be expressed as a function of the N inputs
using a nonlinear differential equation. Equation (3.4) shows the output of the first stage of the
general MPRO in Fig. 3.1 as a function of the N inputs. When Fg(vi)=gmvi and Fo(v1)=gov1, this
$$\sum_{i=1}^{N} x_i F_g\big(v_i(t)\big) + F_o\big(v_1(t)\big) + C_g \frac{dv_1(t)}{dt} = 0 \tag{3.4}$$
In this section, we first use linear analysis to explain the different oscillation modes that can
exist in an MPRO and determine which is dominant. Linear analysis also helps in understanding
how we can size the different coupling paths of the MPRO and what limits its maximum
oscillation frequency. Next, we use a simplified nonlinear analysis to find an expression for the
oscillation amplitude and a more accurate expression for the oscillation frequency.
Due to the symmetry of the system in Fig. 3.1 and in the absence of mismatches, if the MPRO
is to oscillate at a certain frequency, ωn, the phase shift between each two successive nodes and
the amplitude of oscillation at each node in the oscillator will ideally be the same. Therefore, the
where ωn is the oscillation frequency and Δφ=2πn/N since the total phase shift around the loop
should be a multiple of 2π, N is the number of stages in the oscillator, and n can take values
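$$a_n \sum_{i=1}^{N} x_i \cos\left(\omega_n t - \frac{2\pi n}{N} i\right) + \cos\left(\omega_n t - \frac{2\pi n}{N}\right) - \omega_n \tau \sin\left(\omega_n t - \frac{2\pi n}{N}\right) = 0 \tag{3.6}$$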
By equating the cos(ωnt) and the sin(ωnt) terms of (3.6), we get expressions for the oscillation
frequency of the nth mode (3.7) and the minimum dc gain required for this mode to exist (3.8).
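$$\omega_n \tau = -\frac{\sum_{i=1}^{N} x_i \sin\left(\frac{2\pi n}{N}(i-1)\right)}{\sum_{i=1}^{N} x_i \cos\left(\frac{2\pi n}{N}(i-1)\right)} \tag{3.7}$$
$$a_n = \frac{-1}{\sum_{i=1}^{N} x_i \cos\left(\frac{2\pi n}{N}(i-1)\right)} \tag{3.8}$$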
In practice, the oscillator starts first from a linear mode of operation where all the buffers are
indeed acting as linear transconductors. All oscillation modes that have mode gains, an, lower
than the actual dc gain of the inverter, ao, start to grow. As the oscillation amplitude grows, the
effective gain of the inverter drops due to nonlinearity. Consequently, modes with higher mode
gain die out and only the mode that requires the minimum gain continues to oscillate and hence
is the dominant mode. Therefore, for an MPRO with N stages, once we choose a certain value for
the relative sizing vector x=[x1 x2 x3 … xN]^T ∈ ℝ^N, the index, n*, of the dominant mode is given
by (3.9). Note that this definition for the dominant mode is dependent only on the relative sizing
vector, x , of the MPRO and is valid for any buffer design used to build the actual oscillator.
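$$n^* = \arg\min_{n=1,2,\dots,N} \left\{ \frac{-1}{\sum_{i=1}^{N} x_i \cos\left(\frac{2\pi n}{N}(i-1)\right)} > 0 \right\} = \arg\max_{n=1,2,\dots,N} \left\{ -\sum_{i=1}^{N} x_i \cos\left(\frac{2\pi n}{N}(i-1)\right) \right\} \tag{3.9}$$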
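The mode gain, the normalized frequency, and the dominant-mode selection can be sketched in a few lines. This is an illustrative sketch only: it assumes the sign convention in which ωnτ = −Σ xi sin(2πn(i−1)/N) / Σ xi cos(2πn(i−1)/N) and an = −1/Σ xi cos(2πn(i−1)/N), and it breaks the tie between a mode and its mirror image (which has the same gain) in favor of positive frequency. The function names are hypothetical.

```python
import math


def mode_sums(x, n):
    """Cosine and sine sums of (3.7)-(3.8); x[i-1] is the fractional size x_i."""
    N = len(x)
    C = sum(xi * math.cos(2 * math.pi * n * k / N) for k, xi in enumerate(x))
    S = sum(xi * math.sin(2 * math.pi * n * k / N) for k, xi in enumerate(x))
    return C, S


def mode_gain(x, n):
    """Minimum dc gain a_n for mode n; infinite if the mode cannot exist."""
    C, _ = mode_sums(x, n)
    return -1.0 / C if C < 0 else math.inf


def norm_freq(x, n):
    """Normalized frequency w_n * tau of mode n (assumed sign convention)."""
    C, S = mode_sums(x, n)
    return -S / C


def dominant_mode(x):
    """Mode with the smallest positive gain; ties go to positive frequency."""
    N = len(x)
    modes = [n for n in range(1, N + 1) if mode_sums(x, n)[0] < 0]
    return min(modes, key=lambda n: (round(mode_gain(x, n), 9),
                                     norm_freq(x, n) <= 0))


# 5-stage SPRO: only x_N is nonzero.
x_spro = [0, 0, 0, 0, 1]
n_star = dominant_mode(x_spro)
print(n_star, mode_gain(x_spro, n_star), norm_freq(x_spro, n_star))
```

For the 5-stage SPRO the sketch selects n* = 3, with a mode gain of 1/cos(π/5) and a normalized frequency of tan(π/5).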
The nonlinear nature of the buffer elements in the MPRO affects the oscillator in two ways.
First, the nonlinearity determines the oscillation amplitude. Second, it can change the oscillation
frequency since the effective output conductance is often dependent on the oscillation amplitude.
Equation (3.7) predicts the oscillation frequency normalized to a certain time constant.
Depending on the unit inverting buffer used to implement the MPRO, this time constant may not
actually be constant. To explore the dependence of the time constant, τ, and the oscillation
amplitude, Vo, on the sizing vector, x, we simplify equation (3.4) by solving it using a
single-tone harmonic balance (3.10). This assumption is later verified to be a good approximation when
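$$I_g(V_o) \sum_{i=1}^{N} x_i \cos\left(\omega_n t - \frac{2\pi n}{N} i\right) + I_o(V_o) \cos\left(\omega_n t - \frac{2\pi n}{N}\right) - \omega_n C_g V_o \sin\left(\omega_n t - \frac{2\pi n}{N}\right) = 0 \tag{3.10}$$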
In (3.10), Vo is the amplitude of the fundamental component of the voltage waveform, Ig(Vo)
and Io(Vo) are the fundamental components of the transconductor and the output conductance
currents, respectively. Similar to the linear analysis, equation (3.10) can be decomposed into the
following:
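$$I_g(V_o) \sum_{i=1}^{N} x_i \cos\left(\frac{2\pi n}{N}(i-1)\right) + I_o(V_o) = 0 \tag{3.11}$$
$$I_g(V_o) \sum_{i=1}^{N} x_i \sin\left(\frac{2\pi n}{N}(i-1)\right) - \omega_n C_g V_o = 0 \tag{3.12}$$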
Using the frequency and mode gain definitions in (3.7) and (3.8), we can reduce (3.11) and
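$$\frac{\omega_{n^*} C_g V_o}{I_o(V_o)} = -\frac{\sum_{i=1}^{N} x_i \sin\left(\frac{2\pi n^*}{N}(i-1)\right)}{\sum_{i=1}^{N} x_i \cos\left(\frac{2\pi n^*}{N}(i-1)\right)} \tag{3.14}$$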
Equation (3.13) shows that the oscillation amplitude of an MPRO depends only on the type of
nonlinearity that the buffer exhibits and the value of the dominant-mode gain. Fig. 3.5 is an
interpretation for (3.13). The figure shows arbitrary amplitude characteristics for an inverter. At
small input amplitudes, the inverter acts as a weakly nonlinear amplifier that can be
approximated by a third order nonlinearity coefficient. As the input amplitude increases, the
output resembles a square-wave whose fundamental component is (2/π)Vdd. As the figure shows,
the oscillation amplitude is determined by the amplitude that causes enough gain compression
such that the effective dc gain of the inverter is exactly equal to the gain required to sustain the
To find an accurate expression for the oscillation amplitude, the exact shape of the buffer
nonlinearity must be known. In general, two limiting cases can be examined. The first case is
when the mode gain is close to unity where the oscillation waveform is approximately a square-
wave with a fundamental component (2/π)Vdd. In this case, the oscillation amplitude can be
approximately given by (Vo/Vdd)= (2/π)(1/an*). The other limiting case is when the mode gain is
close to the small signal dc gain of the inverter, ao, which is the maximum gain an inverter can
provide. This case causes the oscillation amplitude to be very small and the inverter nonlinearity
can be approximated using a third-order nonlinearity coefficient. This coefficient can be chosen
Equation (3.15) is an empirical expression for the oscillation amplitude. This expression is
derived by taking the average of the two limiting cases discussed above. The third order
nonlinearity coefficient is empirically chosen such that Vo is equal to (2/π)Vdd when an*=1.
$$\frac{V_o}{V_{dd}} = \frac{1}{\pi}\left(\frac{1}{a_{n^*}} + \sqrt{\frac{a_o - a_{n^*}}{a_o - 1}}\right) \tag{3.15}$$
Fig. 3.6 shows the simulated oscillation amplitude as a function of the dominant mode gain
together with equation (3.15). The values of the dc gain, ao, used to plot equation (3.15) for
different channel lengths are shown in Table 3.1. These values are obtained using ac simulation
and will be used throughout the rest of this chapter. For each value of the mode gain, we use an
MPRO similar to that in Fig. 3.1 with the unit inverter shown in Fig. 3.3 and with sizing factors
as shown in Table 3.2. The way these sizing factors are derived is explained in the next
subsection. To verify the validity of the expression for different channel lengths, each simulation
is repeated using different channel lengths for the NMOS and the PMOS transistors of the
inverter. As the figure shows, very good agreement between the empirical equation and the
[Plots (a) and (b): Vo (mV) versus an*, theory and simulation.]
Fig. 3.6. Fundamental component of the oscillation waveform as a function of the dominant mode gain.
(a) L=60nm (b) L=200nm.
Table 3.1. Small-signal dc-gain and time-constant parameters of the unit inverter for different channel
lengths
Channel Length (L) 60nm 100nm 200nm 400nm
To find an expression for the nonlinear time constant, we compare equations (3.14) and (3.7)
to derive (3.16), the effective time constant. Equation (3.16) shows that the effective time
constant is similar to the oscillation amplitude in that it only depends on the dominant mode gain
and the buffer nonlinearity. This result implies that two MPROs with the same dominant mode
gain but with different sizing factors and possibly, different number of stages would still have the
$$\tau(a_{n^*}) = \frac{C_g V_o}{I_o(V_o)} \tag{3.16}$$
As we did for the oscillation amplitude, we first examine the two limiting cases when an*→1
and when an*→ao, the small signal dc gain of the buffer. These two cases represent the
minimum and the maximum dominant mode gains that can exist. Obviously, a mode gain below
unity would not satisfy the oscillation unity gain condition and an oscillator with a dominant
For the case when an*→1, the equivalent time constant τ(an*) approaches τp, the buffer’s
fanout-1 propagation delay. To explain this, we examine equations (3.7) and (3.8) with xN=1 and
N a large odd number. In this case, we see that an*→1 and fn*≈1/(2Nτp) which is the familiar
equation for the oscillation frequency of a long-chain SPRO having a delay per stage, τp.
When an*→ao, the oscillation amplitude is infinitesimally small since any growth in amplitude
will cause the gain to drop due to nonlinearity. Accordingly, the linear expression for the
oscillation frequency becomes valid and the equivalent time constant is equal to the small signal
Fig. 3.7 shows the simulated values of the effective time constant as a function of the mode
gain for different channel lengths of the inverter. The plot is obtained by simulating different
MPROs having the sizing factors shown in Table 3.2. For each MPRO, we compare the
oscillation frequency obtained from simulation, fsim, to the expression in (3.7) and find the value
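$$\tau_{sim} = -\frac{1}{2\pi f_{sim}} \cdot \frac{\sum_{i=1}^{N} x_i \sin\left(\frac{2\pi n}{N}(i-1)\right)}{\sum_{i=1}^{N} x_i \cos\left(\frac{2\pi n}{N}(i-1)\right)} \tag{3.17}$$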
[Plots (a) and (b): effective time constant τ (ps) versus an*. Theory: Eq. (3.18). Simulation: Eq. (3.17).]
Fig. 3.7. Effective time constant as a function of the dominant mode gain. (a) L=60nm (b) L=200nm.
As the figure shows, the effective time constant can be very well approximated using a linear
interpolation between the two extreme cases of mode gain. Equation (3.18) shows the
approximate expression for the effective time constant as a function of the dominant mode gain.
The validity of this approximation is verified by repeating the simulation for different channel
lengths as shown in Fig. 3.7. In all cases, we found a very good agreement between the linear
interpolation and the simulated values of the time constant. The values for ao, τo and τp used in
Fig. 3.7 and throughout the rest of this chapter are shown in Table 3.1. These values were
obtained using ac simulation for ao. Values of τp are the fanout-1 propagation delay of the unit
cell obtained by simulating a 19-stage SPRO. Finally, values of τo are those that best fit the
curves in Fig. 3.7. These values were found to be very close to the actual small-signal time
$$\tau(a_{n^*}) = \tau_p + \frac{\tau_o - \tau_p}{a_o - 1}\left(a_{n^*} - 1\right) \tag{3.18}$$
Using equations (3.7), (3.8), and (3.18), an expression for the oscillation frequency can be
derived (3.19). Equation (3.19) gives the oscillation frequency of the dominant mode of any
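$$f_{n^*} = -\frac{1}{2\pi} \cdot \frac{(a_o - 1) \sum_{i=1}^{N} x_i \sin\left(\frac{2\pi n^*}{N}(i-1)\right)}{(a_o \tau_p - \tau_o) \sum_{i=1}^{N} x_i \cos\left(\frac{2\pi n^*}{N}(i-1)\right) - (\tau_o - \tau_p)} \tag{3.19}$$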
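The chain from mode gain to effective time constant to oscillation frequency can be sketched as follows. This is a minimal sketch, assuming the linear interpolation τ(an*) = τp + (τo − τp)(an* − 1)/(ao − 1) and the sign convention ωnτ = −Σ xi sin(·)/Σ xi cos(·); the parameter values for ao, τo and τp are hypothetical placeholders and are not the entries of Table 3.1.

```python
import math


def tau_eff(a_mode, a_o, tau_o, tau_p):
    """Linear interpolation between tau_p (gain 1) and tau_o (gain a_o)."""
    return tau_p + (tau_o - tau_p) * (a_mode - 1.0) / (a_o - 1.0)


def osc_freq(x, n, a_o, tau_o, tau_p):
    """Oscillation frequency in Hz of mode n for the fractional sizing x."""
    N = len(x)
    C = sum(xi * math.cos(2 * math.pi * n * k / N) for k, xi in enumerate(x))
    S = sum(xi * math.sin(2 * math.pi * n * k / N) for k, xi in enumerate(x))
    a_mode = -1.0 / C   # mode gain
    w_tau = -S / C      # normalized frequency
    return w_tau / (2 * math.pi * tau_eff(a_mode, a_o, tau_o, tau_p))


# Hypothetical buffer parameters, for illustration only.
A_O, TAU_O, TAU_P = 10.0, 30e-12, 5e-12

# 19-stage SPRO, dominant mode n* = (N + 1) / 2 = 10.
x_spro = [0] * 18 + [1]
f = osc_freq(x_spro, 10, A_O, TAU_O, TAU_P)
print(f"{f / 1e9:.2f} GHz versus 1/(2*N*tau_p) = {1 / (2 * 19 * TAU_P) / 1e9:.2f} GHz")
```

For a long SPRO the result stays close to 1/(2Nτp), as expected when the mode gain is near unity.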
To verify this general expression for the MPRO frequency, the frequency is calculated for
different oscillator configurations and compared against circuit simulation results. We start with
the degenerate case of an SPRO. In this case, xN=1 and all other fractional sizing factors are
equal to zero. Equation (3.9) shows that the dominant oscillation mode for SPROs is given by
n*=ceil(N/2). For even N, n*=N/2 and the oscillation frequency of the dominant mode is always
equal to zero indicating the fact that even numbered SPROs latch and do not oscillate. For odd N,
n*=(N+1)/2 showing that the phase shift between every two successive nodes is π+(π/N) which is
a well-known result for SPROs. The oscillation frequency can be found from (3.19) and is
expressed by (3.20) and compared to simulation in Fig. 3.8. The expression reduces to the
[Plots (a) and (b): oscillation frequency (GHz) versus N, theory and simulation.]
Fig. 3.8. Oscillation frequency of an SPRO for different numbers of stages. (a) L=60nm (b) L=200nm
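As a sketch of how the general frequency expression specializes to an SPRO, assume odd N, x_N = 1, n* = (N+1)/2, and the sign convention in which ω_nτ = −Σ x_i sin(·)/Σ x_i cos(·). The sums then reduce to Σ x_i sin(·) = sin(π/N) and Σ x_i cos(·) = −cos(π/N), so that, combined with the linear interpolation for the effective time constant,

$$a_{n^*} = \frac{1}{\cos(\pi/N)}, \qquad \omega_{n^*}\tau = \tan\left(\frac{\pi}{N}\right), \qquad f_{n^*} = \frac{1}{2\pi} \cdot \frac{(a_o - 1)\sin(\pi/N)}{(a_o \tau_p - \tau_o)\cos(\pi/N) + (\tau_o - \tau_p)}$$

For large N, sin(π/N) ≈ π/N and cos(π/N) ≈ 1, and the last expression approaches 1/(2Nτ_p). This form is derived here for illustration and may be written differently from the corresponding expression in the original text.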
Fig. 3.9 shows the comparison for the case of a 3-stage dual-path ring oscillator. Fig. 3.9(a)
shows the schematic diagram of the oscillator. In this special case, the mode gain is equal to 2
and is independent of the sizing coefficients. Accordingly, the effective time constant is also
independent of sizing. The oscillation frequency can be obtained from (3.19) and is plotted in
Fig. 3.9(b). The analytical results were found to match the simulation for all the simulated
channel lengths. The figure only shows the results for a channel length of 200nm.
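The constant mode gain of the 3-stage dual-path oscillator can be checked directly. The following is a sketch, assuming only h2 and h3 are nonzero (so x2 + x3 = 1) and the sign convention in which a_n = −1/Σ x_i cos(·) and ω_nτ = −Σ x_i sin(·)/Σ x_i cos(·). For n = 1 and n = 2, both cos(2πn/3) and cos(4πn/3) equal −1/2, so

$$\sum_{i} x_i \cos\left(\frac{2\pi n}{3}(i-1)\right) = -\frac{x_2 + x_3}{2} = -\frac{1}{2} \quad\Rightarrow\quad a_n = 2$$

regardless of sizing. The sine sum for n = 2 is (√3/2)(x_3 − x_2), which gives

$$\omega_{n^*}\tau = \sqrt{3}\,\frac{|h_3 - h_2|}{h_3 + h_2}$$

where n = 2 has positive frequency for h_3 > h_2 and n = 1 for h_2 > h_3. Under these assumptions the frequency depends on sizing only through this ratio, and it approaches the 3-stage SPRO value, tan(π/3) = √3, as h_3/h_2 grows.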
[Fig. 3.9: (a) schematic diagram; (b) oscillation frequency (GHz) versus h3/h2, theory and simulation.]
Fig. 3.10 and Fig. 3.11 show the comparison between equation (3.19) and the simulation
results for two different 8-stage dual-path ring oscillators. The oscillator in Fig. 3.10(a) has a
skew of 5 stages meaning that v1 is driven by v8 and v3. Similarly, the oscillator in Fig. 3.11(a)
has a skew of 3 stages. Fig. 3.10(b) and Fig. 3.11(b) show the simulation results and the
analytical expression for the oscillation frequency. The figures also mark the different regions
where different modes are dominant. These regions are found using equation (3.9). The
accompanied by a change in the output phase order when the sizing factors are changed. For this
reason, if the MPRO is to be used as an oscillator that is desired to operate at a stable, well-
In both figures, a discrepancy between the simulation and the analytical prediction at the
boundary between the modes n*=5 and n*=4 is observed. This discrepancy is due to the very
small difference in mode gains at this region and the way these oscillators are simulated. This
small difference in mode gain causes the possibility of exciting a non-dominant mode if the
initial conditions of the oscillator are suitable for this. More about mode stability and dependence
Fig. 3.10. Oscillation frequency of an 8-stage dual-path ring oscillator with a skew of 5 stages. (a)
Schematic diagram of the oscillator (b) Simulation results (black dots) and analytical expression (gray curve)
for L=200nm.
Fig. 3.11. Oscillation frequency of an 8-stage dual-path ring oscillator with a skew of 3 stages. (a)
Schematic diagram of the oscillator (b) Simulation results (black dots) and analytical expression (gray curve)
for L=200nm.
Using the derived analytical expression for oscillation frequency as a function of the
fractional sizing vector, x , we can find the value of x that maximizes the oscillation frequency
for a given number of stages and certain constraints on the mode gain. This maximization is
next section, the design for maximum frequency allows us to design oscillators with lower phase
noise.
We first define the two matrices A and B as given by (3.21) and (3.22) and then rewrite (3.7)
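$$A = \begin{bmatrix} \alpha_{01} & \cdots & \alpha_{0N} \\ \vdots & \ddots & \vdots \\ \alpha_{(N-1)1} & \cdots & \alpha_{(N-1)N} \end{bmatrix}; \qquad \alpha_{ni} = \sin\left(\frac{2\pi n}{N}(i-1)\right) \tag{3.21}$$
$$B = \begin{bmatrix} b_{01} & \cdots & b_{0N} \\ \vdots & \ddots & \vdots \\ b_{(N-1)1} & \cdots & b_{(N-1)N} \end{bmatrix}; \qquad b_{ni} = \cos\left(\frac{2\pi n}{N}(i-1)\right) \tag{3.22}$$
$$a_n = \frac{-1}{B_n^T x} \tag{3.23}$$
$$\omega_n \tau = -\frac{A_n^T x}{B_n^T x} \tag{3.24}$$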
where x=[x1 x2 x3 … xN]^T ∈ ℝ^N is the fractional sizing vector and each of the row vectors AnT
The goal is to maximize ωn*τ where the feasible region of the vector x in ℝ^N can be
expressed by a group of linear equations and linear inequalities. This region is defined by
$$\left\{ x \in \mathbb{R}^N : x \ge 0,\ \mathbf{1}^T x = 1 \right\} \tag{3.25}$$
where 1 = [1 1 1 … 1]^T. The first constraint requires x to be non-negative while the second
The algorithm used to solve this optimization problem can be summarized in the following
steps:
1. For each mode n, where n and N do not have common factors (to guarantee N
distinctive phases at the output), find the region of space n where this mode is
dominant. This can be achieved by adding two additional constraints to the feasible
region as shown in (3.26). The first inequality in (3.26) constrains the mode gain to
be less than a certain maximum value, amax. This is important to guarantee certain
minimum oscillation amplitude. The second inequality guarantees the mode gain of the
mode n is less than that of all other modes. This allows us to maximize the frequency
of each mode in the region where this mode is dominant. The parameter m allows us to
set a certain margin between the dominant mode gain and other modes to guarantee
2. Find the maximum normalized oscillation frequency of each mode and the
problem:
iteration using a binary search algorithm. Each of the iterations in this algorithm is a
linear programming feasibility problem that can be solved in MATLAB by methods such
3. Find the absolute maximum frequency over all modes and the corresponding optimum
sizing vector, x * .
4. Repeat the previous steps for MPROs with different numbers of stages.
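The structure of the steps above can be illustrated with a small search. The sketch below is not the binary-search and linear-programming procedure described in the text: it swaps in a brute-force enumeration over small integer sizing factors, which is only practical for small N. The form of the margin constraint (every other mode gain at least (1 + m) times the gain of the candidate mode), the tolerance, and the function names are assumptions made for illustration.

```python
import math
from itertools import product


def sums(h, n):
    """Unnormalized cosine and sine sums for mode n; h[i-1] is h_i."""
    N = len(h)
    C = sum(hi * math.cos(2 * math.pi * n * k / N) for k, hi in enumerate(h))
    S = sum(hi * math.sin(2 * math.pi * n * k / N) for k, hi in enumerate(h))
    return C, S


def best_sizing(N, h_max=2, a_max=math.inf, margin=0.0, tol=1e-9):
    """Return (w*tau, mode, h) with the largest normalized frequency found."""
    best = (0.0, None, None)
    for tail in product(range(h_max + 1), repeat=N - 1):
        h = (0,) + tail                      # h_1 = 0, as in the text
        H = sum(h)
        if H == 0:
            continue
        gains, freqs = {}, {}
        for n in range(1, N):
            C, S = sums(h, n)
            if C < -tol:                     # mode can exist
                gains[n] = -H / C
                freqs[n] = -S / C
        for n, a_n in gains.items():
            # Step 1: N distinct phases, bounded gain, better than current best.
            if math.gcd(n, N) != 1 or a_n > a_max + tol or freqs[n] <= best[0]:
                continue
            # Dominance with margin; the mirror mode N - n is the same mode.
            others = [a for k, a in gains.items() if k not in (n, N - n)]
            if all(a >= (1 + margin) * a_n - tol for a in others):
                best = (freqs[n], n, h)
    return best


w_tau, mode, h = best_sizing(5)
print(w_tau, mode, h)
```

For N = 5 the search space contains the sizing (h4, h5) = (1, 1), whose normalized frequency is about 3.08, so the returned value is at least that large. Increasing `margin` or lowering `a_max` shrinks the set of admissible sizings.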
Fig. 3.12 shows the results obtained from running the above algorithm assuming amax is
infinite and m= 0. These results represent the absolute maximum normalized frequency that can
be obtained from an MPRO of a certain number of stages. The results show a linear increase in
the maximum possible normalized oscillation frequency as the number of stages increases
provided that the dc gain of the buffer is sufficient to provide the required amplification. Table 3.2
shows optimum values of the sizing factors, hi, and the corresponding mode gain for varying
numbers of stages. Note that in this optimization and subsequent derivations, x1 (and accordingly,
h1) is set to 0 since a diode connected transconductor mainly reduces the effective value of the
[Fig. 3.12 plot: maximum normalized frequency (ωnτ)max versus N; optimization output compared with (2/π)·N.]
It is interesting to note that the results in Table 3.2 are considerably systematic and can be
easily deduced for higher values of N. Using (3.8) and the values in Table 3.2, the dominant
mode gain for maximum frequency is given by an*=N–1 as can be seen in Table 3.2. From (3.7),
Since the results in Fig. 3.12 and Table 3.2 were derived assuming m= 0, these results cause
the oscillator to oscillate at the boundary between different oscillation modes. In fact, it can be
easily shown by substituting the values in Table 3.2 into (3.8) that for each of these MPROs, the
value of the mode gain is the same for all modes n= 1, 2, …, N-1. This means that the maximum
oscillation frequency of the MPRO is achieved at the cost of the minimum possible mode
stability. Accordingly, if the MPRO is to be used as an oscillator that has a stable well-defined
oscillation mode, some back-off from this maximum frequency is needed to guarantee mode
stability. However, the results obtained for the maximum frequency and the corresponding sizing
factors are still important for two reasons. First, they serve as a benchmark to show the
maximum achievable frequency under all circumstances. And second, they can still be used for
driven MPROs like in injection-locked oscillators and frequency dividers where mode stability is
Table 3.2. Optimum sizing factors for maximum oscillation frequency assuming unlimited dc gain and zero
mode gain margins
N    an*  h2   h3   h4   h5   h6   h7   h8   h9   h10  h11  h12  h13  h14  h15  h16
3    2    0    1
4    3    0    1    2
5    4    0    0    1    1
6    5    0    0    1    2    2
7    6    0    0    0    1    1    1
8    7    0    0    0    1    2    2    2
9    8    0    0    0    0    1    1    1    1
10   9    0    0    0    0    1    2    2    2    2
11   10   0    0    0    0    0    1    1    1    1    1
12   11   0    0    0    0    0    1    2    2    2    2    2
13   12   0    0    0    0    0    0    1    1    1    1    1    1
14   13   0    0    0    0    0    0    1    2    2    2    2    2    2
15   14   0    0    0    0    0    0    0    1    1    1    1    1    1    1
16   15   0    0    0    0    0    0    0    1    2    2    2    2    2    2    2
When the mode gain is limited by a certain finite amax, the optimum values of hi will be
different from those shown in Table 3.2 for N>(1+amax). The new values can be obtained
similarly by running the same algorithm as before and setting the appropriate value of the
maximum gain. In this case, the maximum normalized oscillation frequency saturates
approximately at (2/π)(amax+1) where amax is the upper bound of the mode gain used in the
optimization problem. In general, if we use the upper bound shown in Fig. 3.12 for (ωnτ)max, that
is (ωnτ)max < (2/π)N, together with (3.18), an approximate maximum limit on the oscillation
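$$f_{\max}(N) \approx \begin{cases} \dfrac{1}{\pi^2} \cdot \dfrac{N (a_o - 1)}{(a_o - 1)\tau_p + (N - 2)(\tau_o - \tau_p)}; & 3 \le N \le a_{\max} + 1 \\[2ex] \dfrac{1}{\pi^2} \cdot \dfrac{(a_{\max} + 1)(a_o - 1)}{(a_o - 1)\tau_p + (a_{\max} - 1)(\tau_o - \tau_p)}; & N > a_{\max} + 1 \end{cases} \tag{3.28}$$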
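The limit can be evaluated by combining the bound on the normalized frequency with the interpolated time constant. The sketch below assumes that the dominant mode gain at the limit is N − 1 while N ≤ amax + 1 and amax otherwise, with (ωnτ)max = (2/π)(gain + 1); the function name and parameter values are hypothetical.

```python
import math


def f_max_limit(N, a_max, a_o, tau_o, tau_p):
    """Approximate upper limit on the oscillation frequency in Hz."""
    a_star = min(N - 1, a_max)                 # mode gain at the limit
    w_tau = (2 / math.pi) * (a_star + 1)       # bound on normalized frequency
    tau = tau_p + (tau_o - tau_p) * (a_star - 1) / (a_o - 1)
    return w_tau / (2 * math.pi * tau)


# Hypothetical buffer parameters, for illustration only.
A_O, TAU_O, TAU_P = 10.0, 30e-12, 5e-12

for N in (3, 5, 8, 12, 16):
    print(N, f"{f_max_limit(N, 6.0, A_O, TAU_O, TAU_P) / 1e9:.2f} GHz")
```

With amax = 6 in this example, the limit stops changing once N exceeds 7, which is the saturation described in the text.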
From this expression, we see that theoretically, the absolute maximum frequency of an MPRO
can be achieved when amax is set equal to ao, the maximum available dc gain of the inverter.
When this gain is high enough, the limit is equal to (1/π²)ao/τo = (1/π²)gm/Cg which is
proportional to the cut-off frequency of the unit inverter used to build the oscillator.
Fig. 3.13 shows the actual maximum oscillation frequency obtained from circuit simulation
together with the theoretical expression in (3.19) and the maximum limit in (3.28). The increase
in the actual frequency is only sub-linear as the figure shows due to the increase of the effective
time constant with mode gain as explained in the previous subsection. As the figure shows, as N
increases, both equations (3.19) and (3.28) converge and they match the simulation results very
closely. In these simulations, the initial state of the oscillator is adjusted such that the desired
oscillation mode is excited. It should be noticed that for N=3, the maximum frequency is that of a
[Plots (a) and (b): maximum oscillation frequency (GHz) versus N. Simulation, Eq. (3.19), and Eq. (3.28).]
Fig. 3.13. Simulated maximum oscillation frequency for different numbers of stages.
(a) L=100nm (b) L=200nm.
A common problem in MPRO design is the stability of the dominant oscillation mode. Mode
stability refers to whether the MPRO always oscillates at the same mode regardless of the initial
conditions of the oscillator. This problem is especially pronounced for MPROs with a large
number of phases. This is due to the existence of many modes and the very small differences in
the value of the mode gain of adjacent modes if the MPRO is not well designed.
Analyzing the problem of mode stability accurately requires an accurate nonlinear state-space
analysis to identify the different regions of attraction for each of the oscillation modes in the N-
dimensional state space of the oscillator [30]. Such accurate analysis is out of the scope of this
work and cannot be made using the presented model. However, our model still provides
sufficient information for designing MPROs that are less sensitive to their initial conditions and
that have a stable, well-defined oscillation mode regardless of their initial state.
As an example, we use the two 8-stage dual-path ring oscillators shown in Fig. 3.10 and Fig.
3.11 with coupling skew of 5 and 3, respectively. Fig. 3.14 shows the mode gain of the different
possible modes of oscillation for the two oscillators as a function of the sizing factors. These
curves can be obtained using equation (3.8) for different values of n. Modes that are not plotted
in the figure are either degenerate or impossible (having negative mode gain). As the figure
shows, for both oscillators, the difference in the values of mode gain for n=4 and n=5 is very
small near the boundary between these two modes. In general, when the mode gain difference
between two modes is small, the oscillator can operate in either one depending on initial
conditions. Hence, designing an oscillator near this boundary is undesirable although the
Another example is shown in Fig. 3.15. The figure shows the mode-gain profile of two 47-
stage MPROs. The first MPRO is designed using the design procedure explained in the previous
section for maximum oscillation frequency at a maximum mode gain amax=2. The oscillator has
sizing factors, (h47,h3)=(2,1), a dominant mode, n*=31, a dominant mode gain, an*=2, and a
normalized oscillation frequency, ωn*τ=1.73. Fig. 3.15(a) plots the mode gain profile of this
oscillator. The even symmetry in the figure directly follows from (3.8). Combined with the odd
symmetry in equation (3.7), every two mirror image modes are actually degenerate, i.e. they are
the same mode. As the figure shows, the mode gain margin between the dominant mode, n=31,
and the adjacent modes is very small. This causes the actual oscillation mode to be sensitive to
initial conditions.
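The quoted mode gain and normalized frequency of the 47-stage example can be reproduced from the sizing factors alone. This is a minimal check, assuming the sign convention an = −1/Σ xi cos(2πn(i−1)/N) and ωnτ = −Σ xi sin(·)/Σ xi cos(·); the helper name and the dictionary layout are hypothetical.

```python
import math


def mode_gain_and_freq(h, N, n):
    """h maps the input-node index i to its sizing factor h_i."""
    H = sum(h.values())
    C = sum(hi * math.cos(2 * math.pi * n * (i - 1) / N) for i, hi in h.items()) / H
    S = sum(hi * math.sin(2 * math.pi * n * (i - 1) / N) for i, hi in h.items()) / H
    return -1.0 / C, -S / C


a_n, w_tau = mode_gain_and_freq({47: 2, 3: 1}, 47, 31)
print(f"n=31: gain {a_n:.2f}, w*tau {w_tau:.2f}")

# Neighbouring mode, to show how small the gain margin is.
a_next, _ = mode_gain_and_freq({47: 2, 3: 1}, 47, 32)
print(f"n=32: gain {a_next:.3f} versus {a_n:.3f}")
```

The gain of the adjacent mode differs from that of n = 31 by well under one percent, which illustrates why the oscillation mode of this design is sensitive to initial conditions.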
Fig. 3.14. Mode gain of the different possible oscillation modes of an 8-stage dual-path ring oscillator
having (a) Skew=5 (b) Skew=3. The vertical dashed lines represent boundaries between different modes.
Fig. 3.15(b) shows the mode gain profile of a second MPRO. The sizing factors of the second
oscillator are also obtained using the optimization algorithm with the mode gain margin, m, set
to 50% to allow for mode stability. After quantizing the sizing factors into integers, the second
mode, n*=30, a dominant mode gain, an*=1.84, and a normalized frequency, ωn*τ=1.48. As Fig.
3.15(b) shows, all the modes other than the dominant mode have at least 50% higher mode gain
Fig. 3.15. Mode gain profile of a 47-stage MPRO having a coupling configuration.
(a) (h47,h3)=(2,1) (b) (h47,h25,h22,h14,h11)=(3,5,4,5,2).
To verify the two MPROs for mode stability, both oscillators are simulated using the test-
bench shown in Fig. 3.16. In this simulation, an exponentially decaying sinusoidal waveform is
coupled to one node of the MPRO using a small coupling capacitor of 1fF. The coupling
capacitor is small enough not to cause any significant loading or mismatch for the MPRO when
oscillating at steady state. The excitation frequency, fs, is varied and the steady-state oscillation
frequency and oscillation mode are measured. For the first oscillator in Fig. 3.15(a), the final
oscillation mode was found to be dependent on the excitation frequency. This is attributed to the
small mode gain margin as explained before. For the second oscillator, we found the final
Fig. 3.16. MPRO simulation test-bench. The oscillator is excited using an exponentially decaying
sinusoidal waveform of frequency fs coupled to v1 using a 1-fF capacitor.
In general, the design methodology presented in this section can be used to design any MPRO
with any number of phases. The designer must verify the mode stability by simulating the
oscillator using the different expected initial conditions. Improving the mode stability can be
achieved by increasing the mode gain margin and this usually comes at the cost of reduced
maximum frequency.
The approach for analyzing phase noise follows a linear analysis similar to that in [31]. The
MPRO is modeled as an unstable negative feedback system with the noise sources in the system
acting as inputs. Fig. 3.17(a) shows the small signal model of the MPRO. Every single
transconductor injects noise into the oscillator by an equivalent noise current source, $h_i\,\overline{i_n^2}$. To
find the system transfer function, the loop can be opened at node v1 as shown in Fig. 3.17(b)
where all the noise sources injecting noise current at v1 are lumped into one noise source, $H\,\overline{i_n^2}$,
where H is the total sizing factor. The system is equivalent to the circuit in Fig. 3.17(c) where the
output current from the equivalent transadmittance across v1 is subtracted from the input noise
current. The error current is converted into a noise voltage by the impedance Z(jω)/H seen at
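Based on this model, and using Fig. 3.17(d), the open-loop gain, G(jω), and the closed-loop
transfer function, T(jω), of the MPRO can be expressed in equations (3.29) and (3.30)
respectively.
$$G(j\omega) = \frac{Z(j\omega)}{H}\,H\,Y_m(j\omega) = Z(j\omega)\,Y_m(j\omega) \tag{3.29}$$
$$T(j\omega) = \frac{v_n}{i_{in}}(j\omega) = \frac{1}{H}\,\frac{Z(j\omega)}{1+G(j\omega)} \tag{3.30}$$
At the oscillation frequency, ωn, linear analysis predicts an infinite closed-loop transfer
function and an open-loop gain, G(jωn) = –1. Using the first order Taylor series approximation
of G(jω) around ωn,
$$T(j\omega_n + j\Delta\omega) \approx \frac{1}{H}\,\frac{Z(j\omega_n)}{\Delta\omega\,\left.\dfrac{d}{d\omega}G(j\omega)\right|_{\omega_n}} \tag{3.31}$$
The noise voltage at the node v1 due to the noise current injected at this node can be expressed
as
$$\overline{v_n^2}(j\omega_n + j\Delta\omega) = \frac{\overline{i_n^2}}{H}\left|\frac{Z(j\omega_n)}{\Delta\omega\,\left.\dfrac{d}{d\omega}G(j\omega)\right|_{\omega_n}}\right|^2 \tag{3.32}$$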
Fig. 3.17. Linear model used for noise analysis. (a) Schematic diagram for the MPRO in closed loop. (b)
Schematic of the MPRO in open loop configuration. (c) Equivalent circuit of the MPRO. (d) Block diagram
of the equivalent negative feedback system.
By using assumptions similar to [31]1, the expression for the single-sideband phase noise due
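$$\mathcal{L}\{\Delta\omega\} = \frac{1}{2}\,\frac{N\,\overline{v_n^2}}{V_{rms}^2} = \frac{\overline{i_n^2}\,r^2}{V_o^2}\,\frac{N}{H}\left(\frac{\omega_n}{\Delta\omega}\right)^2\frac{\left|Z(j\omega_n)/r\right|^2}{\left|\omega_n\left.\dfrac{d}{d\omega}G(j\omega)\right|_{\omega_n}\right|^2} \tag{3.33}$$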
To find an expression for the oscillator figure of merit, the normalized power consumption,
Pu, is defined as power consumed in mW by a single stage in the MPRO having a total sizing
factor, H=1. With this definition, the total power consumption of the MPRO can be expressed as
$$P_{tot,mW} = N\,H\,P_u \tag{3.34}$$
Combining (3.33) and (3.34), the figure of merit of the MPRO is expressed in equation (3.35)
where Io(Vo) is the fundamental component of the output conductance current as in (3.10) and the
power is in mW.
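$$FOM = \frac{1}{\mathcal{L}\{\Delta\omega\}\,P_{tot,mW}}\left(\frac{\omega_n}{\Delta\omega}\right)^2 = \frac{V_o^2}{\overline{i_n^2}\,r^2\,P_u}\cdot\frac{\left|\omega_n\left.\dfrac{d}{d\omega}G(j\omega)\right|_{\omega_n}\right|^2}{N^2\left|Z(j\omega_n)/r\right|^2} = \frac{I_o^2}{\overline{i_n^2}}\,\frac{1}{P_u}\,F_{MPRO}(\tilde{x}) \tag{3.35}$$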
1
Assume that half of the power is causing phase noise while the other half is causing amplitude noise that is rejected by the nonlinear
amplitude limiting mechanism of the oscillator. Also assume that the voltage gain between different nodes of the MPRO close to the oscillation
frequency is approximately unity.
The first term of (3.35) is dependent on both the circuit design of the unit buffer and the
MPRO coupling configuration. The coupling configuration determines the value of the dominant
mode gain which in turn determines the value of the oscillation amplitude, Vo, the output
conductance current, Io(Vo), and the equivalent noise current, $\overline{i_n^2}$ [32][33]. The normalized power
dissipation, Pu, is also dependent on mode gain at low values of the mode gain where power is
dominated by the switching power, Pu,sw=Vdd(2Vo)Cgf. Power is impacted not only through Vo but
is also proportional to (ωnτ). At higher values of the dominant mode gain, the total power
remains roughly constant since the inverting buffers behave as class-A amplifiers as the
The second term in (3.35), FMPRO( x ), is only dependent on the coupling structure and not on
the unit buffer design. This term represents the filtering or the frequency selectivity of the
MPRO. In other words, it represents the effective quality factor of the oscillator. By optimizing
this term, we can determine the best phase noise performance of an MPRO given a certain mode
gain.
To derive an expression for FMPRO( x ), we first derive an expression for the transadmittance,
where ~x =[x2 x3 … xN]T and v =[v2 v3 … vN]T. We assume as before that x1=0. Therefore, the open-
gm go T
G j x v v1 (3.37)
1 j
To find the vector ( v /v1), we apply KCL at the node vk with k≠1. By defining the matrix X as
shown in (3.38) and the vector ~xF =[xN xN-1 … x2]T, we can express the vector ( v /v1) as shown in
1 j
g g x2 xN 1
m o
1 j
xN xN 2
X g m go (3.38)
x 1 j
x4
3
g m g o
v v1 X 1 x F (3.39)
g g T
G j m o 1
x X xF (3.40)
1 j
By differentiating (3.40) with respect to jω and using the fact that G(jωn)≈–1, the expression
for FMPRO(x) can be written as (3.41). In this step, we use the effective transconductance and
output conductance, not the small-signal ones. Thus gm/go is equal to an.
2
FMPRO x n 1 x T X n 1 X n 1 x F
2
(3.41)
N
From (3.40) and using the values of the normalized frequency and the mode gain in (3.6) and
x T X n 1 x F x T Cn C FnT x F (3.42)
elements of x and xF and all the real and imaginary components of Xn-1 are greater than or equal
to 0 and with at least one element greater than 0, then, X n 1 xF Cn and xT X n1 CFnT
which means
2
FMPRO x n 1 CFnT Cn n
2 2
(3.43)
N
Interestingly, the value of FMPRO(x) is equal to (ωnτ)², hence reducing the expression for phase
noise to (3.44) and the FOM to (3.45). Since the unit power consumption is proportional, at
most, to (ωnτ) while FMPRO(x)=(ωnτ)², maximizing the normalized oscillation frequency using
the algorithm explained in the previous section also maximizes the FOM given a certain value of
$$\mathcal{L}\{\Delta\omega\} = \frac{\overline{i_n^2}\,r^2}{V_o^2}\,\frac{1}{HN}\left(\frac{\omega_n}{\Delta\omega}\right)^2\frac{1}{(\omega_n\tau)^2} \tag{3.44}$$
$$FOM = \frac{I_o^2}{\overline{i_n^2}}\,\frac{1}{P_u}\,(\omega_n\tau)^2 \tag{3.45}$$
The numerical value for the FOM depends on how the different terms in (3.45), other than
(ωnτ), vary as a function of the dominant mode gain. The analysis has been done for SPROs [34-
37]. The resulting expressions for FOM from [34-37] vary depending on the assumptions but the
final numerical values in all derivations predict an FOM in the vicinity of 170dB when the mode
In this work, we do not derive expressions for the different terms in (3.45). Although this is
possible, we believe that these expressions are of limited practical importance. The more
important conclusion in this paper is that for a certain value of the mode gain, maximizing the
2
An SPRO with a large number of stages is the case when the mode gain approaches unity.
normalized frequency gives the best FOM and accordingly, the best phase noise versus power
tradeoff.
Fig. 3.18 shows the simulated maximum FOM measured at a 10MHz offset frequency as a
function of the dominant mode gain for different channel lengths. These results are obtained by
simulating the MPROs in Table 3.2. The result for an≈1 is obtained using a 19-stage SPRO as an
example of a long chain SPRO and it agrees with the predictions in [34-37]. The maximum FOM
for an=2 is equal to that of a 3-stage SPRO as indicated in Table 3.2. In general, as the figure
shows, as the dominant mode gain increases, the maximum FOM decreases due to the faster
reduction in the first term in (3.45) compared to the possible increase in (ωnτ). Initially this
reduction is relatively fast for an between 1 and 3, and then the FOM stays relatively constant
until the mode gain approaches the small-signal dc gain of the unit inverter. At this point, the
Fig. 3.18. Simulated maximum FOM as a function of the mode gain. Phase noise is measured at a 10-MHz
offset frequency.
For applications requiring multiple clock phases, it is important to note that equation (3.45)
does not directly depend on the number of phases of the MPRO. As long as the mode gain and
the oscillation frequency stay constant, increasing the number of phases does not change the
FOM, theoretically. When the MPRO is sized correctly, the additional phases come at zero
power cost.
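The claim that the FOM does not depend on the number of phases can be checked with a few lines of arithmetic. The sketch below assumes the phase noise has the form of (3.44), the total power follows (3.34), and the FOM is defined as (ωn/Δω)² divided by the product of phase noise and total power. It further assumes Io = Vo/r, which is what makes the result coincide with (3.45). All numerical values and function names are hypothetical and chosen only for illustration.

```python
import math

def phase_noise(num_stages, total_size, wn_tau, wn_over_dw, in2, r, v_o):
    """Phase noise in the assumed form of (3.44), as a linear ratio."""
    return (in2 * r**2 / v_o**2) / (total_size * num_stages) \
        * wn_over_dw**2 / wn_tau**2

def figure_of_merit(num_stages, total_size, wn_tau, wn_over_dw, in2, r, v_o, p_u):
    """FOM = (wn/dw)^2 / (L * P_tot), with P_tot = N*H*P_u as in (3.34)."""
    noise = phase_noise(num_stages, total_size, wn_tau, wn_over_dw, in2, r, v_o)
    p_tot = num_stages * total_size * p_u
    return wn_over_dw**2 / (noise * p_tot)

# Hypothetical unit-cell values; only N and H differ between the two designs.
cell = dict(wn_tau=1.5, wn_over_dw=100.0, in2=1e-22, r=2e3, v_o=0.4, p_u=0.05)
fom_8 = figure_of_merit(8, 3, **cell)
fom_32 = figure_of_merit(32, 12, **cell)
fom_closed = (cell["v_o"] / cell["r"])**2 / (cell["in2"] * cell["p_u"]) \
    * cell["wn_tau"]**2
print(10 * math.log10(fom_8), 10 * math.log10(fom_32))
```

N and H cancel between the phase noise and the total power, so the two designs give the same FOM as long as the mode gain and (ωnτ) are unchanged.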
Table 3.3 shows the coupling configurations for a number of MPROs having different number
of phases spanning a range from 8 to 32. These coupling configurations have been obtained
using the optimization algorithm in Section 3.3.4 where mode stability is also taken into
consideration. The coupling configurations shown in the table are those obtained after quantizing
the resulting sizing factors into integers to be suitable for practical implementation.
The simulation results shown in Table 3.3 are obtained using the same unit cell in Fig. 3.3 with
a channel length of 200nm. As we can see from the table, although the number of phases spans 2
octaves, the FOM of all MPROs differs by less than 0.5dB and their oscillation frequency
differs by only 3%. The FOM stays roughly constant because all of the oscillators are designed
to have approximately the same value for the dominant mode gain, an*, and the normalized
oscillation frequency, (ωnτ). If these MPROs are scaled such that all of them have the same total
power consumption then all of them would have the same thermal phase noise. These results
indicate that the oscillation frequency can be decoupled from the desired number of phases and
In practice, however, for a given power consumption, the number of phases cannot be
increased indefinitely. As the number of stages increases at the same total power, the driving
capability of the MPRO is divided among a greater number of stages reducing the driving
capability of each single phase. In addition, as the number of stages increases, the layout routing
becomes more complex leading eventually to a need for increasing the total oscillator power in
Based on the analysis we present in this paper, the design of a ring oscillator, given a certain
unit buffer design, should proceed in the following steps. First, the required number of phases,
the target oscillation frequency and the power budget should be specified. Second, the values of
the parameters ao, τo, and τp should be found by simulation for different channel lengths of the
devices. Third, using the optimization algorithm explained in Section 3.3, the optimum sizing
factors can be found. The constraint on maximum allowed mode gain must be decreased until the
minimum possible mode gain is found. This is done by running the algorithm and comparing the
oscillation frequency in (3.19) to the target frequency until the minimum mode gain is found. If
the required number of phases is odd and the target frequency is less than or equal to the frequency
of an SPRO (3.20), the optimum design converges to an SPRO. Finally, the phase noise should
be simulated. If the phase noise performance is limited by flicker noise, then the overall
performance can be improved by using longer channel lengths for the devices at the cost of lower
speed. To compensate for the reduced frequency, the previous step should be repeated where the
optimum mode gain needs to be increased. In this way, the flicker noise performance is
3.4. Conclusion
In this chapter, a general model for multipath ring oscillators having arbitrary coupling
define the mode gain as the minimum buffer dc gain required to sustain a certain oscillation
mode. The value of the mode gain is only a function of the sizing factors of the different
coupling branches. The dominant mode is the oscillation mode with the lowest mode gain. We
show that several properties of the oscillator like the oscillation amplitude, the effective time
constant, the normalized power consumption and the effective noise factor are dependent on the
oscillator sizing only through the mode gain. An algorithm is introduced to calculate the
optimum sizing factors for the different coupling branches to achieve the maximum oscillation
frequency and the best phase noise taking into consideration the stability of the oscillation mode.
This chapter provides several guidelines for designing MPROs. The results of the
optimization show a simple heuristic for sizing an MPRO for maximizing the output frequency.
While frequency can be maximized by maximizing the mode gain of the dominant mode, a
designer should not choose a target mode gain near the maximum. Backing off from the
maximum avoids boundaries between modes, and improves an oscillator’s figure of merit.
Furthermore, the analysis indicates that for a given oscillation frequency and target mode gain,
the number of output phases can be increased without penalizing the figure of merit.
In the next chapter, the MPRO analysis is extended for optimizing the design of multiphase
frequency dividers. The divider is considered to be a further generalization of the multipath ring
CHAPTER 4
In this chapter, we extend the analysis we presented in the previous chapter to include
frequency dividers. We show that any latch-based divider can be modeled and analyzed as an
injection-locked oscillator. The locking range of this oscillator is dependent on the oscillator
structure and the injection strength. By deriving accurate expressions for the locking range, a
design procedure can be deduced to maximize the maximum division frequency. If the divider is
not required to operate at the maximum frequency, high speed can be traded off for power
savings.
at steady state to an input periodic signal with a frequency that is an integer multiple (i.e. a
typically used as a frequency divider. In this section, we propose a general model for a SHIL-
MPRO. This model can be used to find the optimal sizing factors and clocking configuration that
To explain our model, we begin with a simple example, the divide-by-2 circuit shown in Fig.
4.1(a). The divider consists of two latches driven by a differential clock input. Fig. 4.1(b) shows
a CML implementation of the latch while Fig. 4.1(c) shows a possible CMOS implementation. In
both implementations, the clock signal modulates the transconductance of the amplifying
transistors by modulating their bias current. For the CMOS implementation, the clock signal also
modulates the output conductance of the inverters since it is dependent on the bias current.
Fig. 4.1. Latch-based divide-by-2 (a) Divider schematic (b) CML latch implementation (c) CMOS latch
implementation
As a result of this mixing effect, the output current of any of the transconductors forming the
divider consists of many harmonic components. One component is a linear product due to the
input of the transconductor. Other components are the different intermodulation products of the
latch input with the clock signal. Fig. 4.2 shows an equivalent block diagram for the quadrature
divider. The divider can be considered an injection-locked quad-phase MPRO. The injection
currents are the result of mixing the MPRO outputs with the different input clock phases.
Fig. 4.2. Equivalent model of the quadrature divider.
A general model for an M-phase input, N-phase output SHIL-MPRO is shown in Fig. 4.3. The
core oscillator is a general MPRO similar to that in Fig. 3.1. The parameters {hi} are sizing
parameters with respect to a reference unit inverter. The parameters {αi} take values 1 or 0
according to whether the ith branch of inverters is driven by the clock or not, respectively. We
assume in our analysis that a single-tone harmonic balance is accurate enough. Therefore,
assuming the MPRO is locked at a frequency ωo, the output voltages can be approximated by
(4.1). The transconductance of the inverter driven by the node vp and driving the node vq is
modulated by the clock signal u(q,p)(t) given by (4.2). We assume the input clock has a frequency
ωin and M distinct phases available. The clock phase distribution is determined by the
parameters (m1,m2) which can take integer values between 0 and M-1. By varying the values of
(m1,m2), all the possible uniform phase distributions can be modeled. Finally, the angle ψ is a
phase shift that depends on the injection frequency. In this analysis, we only consider the case of
tail injection. In general, direct injection can also be included by considering an additional set of
Fig. 4.3. General model of an M-phase input N-phase output SHIL-MPRO.
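$$v_i(t) = V_o\cos\left(\omega_o t - \frac{2\pi n(i-1)}{N}\right) \tag{4.1}$$
$$u_{(q,p)}(t) = U_o\cos\left(\omega_{in}t - \frac{2\pi\left[m_1(q-1)+m_2(p-1)\right]}{M} + \psi\right) \tag{4.2}$$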
The transconductance current of the inverter driven by vp and driving vq can be expressed by
(4.3). We assume the transconductance can be modeled as a memoryless nonlinear time varying
element. The first term in (4.3) represents the different components due to the transconductance
nonlinearity. The second component is the time variation function caused by the input clock
signal. This function is described by the coefficients αi,k. If the ith branch is clocked (αi=1), then
αi,k is equal to λk, the kth Fourier coefficient of the time modulation function. If the ith branch is
not clocked (αi=0), then αi,k is equal to unity for k=0 and is zero for k≠0.
2 n p 1
ig ( q , p ) hp q 1 g m,kVo cos kot k
k 1 N
(4.3)
2 m1 q 1 m2 p 1
p q 1,k cos kint k k
M
k 0
From all the frequency components given by (4.3), we are only interested in the component at
ωo. To calculate this component, we assume that at lock, the input frequency is an integer
multiple of the output frequency, that is ωin=dωo. By using this assumption, the
component of this current is the linear gm product. The second and third components are the
injection current components due to upper side and lower side injection, respectively.
2 n p 1
hp q 1 p q 1,0 g m ,1Vo cos o t
ig ( q , p )
o N
2 n p 1 2 m1 q 1 m2 p 1
V
o h p q 1 g m , kd 1 p q 1, k cos o t kd 1 k k
2 k 1
N M (4.4)
2 n p 1 2 m1 q 1 m2 p 1
Vo
h p q 1 g m , kd 1 p q 1, k cos o t kd 1 k k
2 k 1 N M
At any node vq of the oscillator, the harmonic balance equation can be written as (4.5) where
go and Cg are the output conductance and input capacitance of the unit inverter, respectively. The
parameter $H=\sum_{i=1}^{N} h_i$ is defined similarly to the MPRO case. Substituting from (4.1) and (4.4) into
N dvq
i
i 1
g q ,i q 1
Hg o vq HC g
dt
0 (4.5)
N 2 n q 1 2 n i 1 2 n q 1 2 n q 1
h i i ,0 g m ,1Vo cos o t
N
N
Hg oVo cos o t
N
HC g oVo sin o t
N
i 0
Vo N
2 n q 1 2 n i 1 nd m2 nd m1 m2
2
h g i m , kd 1
i , k cos o t
N
N
k 2
N
M i 1 2 N M q 1
i 1 k 1
Vo N
2 n q 1 2 n i 1 nd m2 nd m1 m2
2
h g i m , kd 1
i , k cos o t
N
N
k 2
N
M i 1 2 N M q 1
i 1 k 1
0
(4.6)
By comparing the first line in (4.6) to equation (3.2), we see that it represents the behavior of
the MPRO. The second and third lines of equation (4.6) represent the injection current. For the
output of the SHIL-MPRO to have uniform phases, the gain and phase conditions of the
oscillator at lock should be independent of the node where the harmonic balance is derived.
From (4.6), we can see that this can only happen if the condition given by (4.7) is satisfied. This
condition represents the allowed input clock phase distributions that can result in uniform output
phases. For the oscillator outputs to have N distinct phases, the mode index n and the number of
stages N should have no common factors. This condition implies that the minimum number of
input phases, M, required to have uniform phases at the output throughout the whole locking
range is equal to the number of output phases, N, divided by the division ratio, d, or M≥N/d.
Equation (4.7) thus serves as the mathematical formulation for the progressive injection phase
$$\frac{m_1+m_2}{M} = \frac{nd}{N} \tag{4.7}$$
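The allowed clock-phase distributions can be enumerated directly from condition (4.7). The sketch below is a minimal illustration that assumes the equality holds modulo 1, since phases that differ by a whole cycle are indistinguishable. This modulo interpretation and the function name are assumptions of the example. Exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction

def allowed_clock_distributions(num_outputs, mode, division, num_inputs):
    """List the (m1, m2) pairs with (m1 + m2)/M == n*d/N, taken modulo 1."""
    target = Fraction(mode * division, num_outputs) % 1
    return [(m1, m2)
            for m1 in range(num_inputs)
            for m2 in range(num_inputs)
            if Fraction(m1 + m2, num_inputs) % 1 == target]

# Quadrature divide-by-2: N=4 output phases, mode n=3, d=2.
four_phase_clock = allowed_clock_distributions(4, 3, 2, 4)
two_phase_clock = allowed_clock_distributions(4, 3, 2, 2)
single_phase_clock = allowed_clock_distributions(4, 3, 2, 1)
print(four_phase_clock, two_phase_clock, single_phase_clock)
```

For this example no distribution exists with a single input phase, while two input phases already admit solutions, in line with the requirement M≥N/d.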
To simplify the analysis, we consider only the lowest order intermodulation product and
assume that all the other components of the injection current can be neglected. Using this
assumption, (4.6) can be simplified to (4.8). The parameters hR,i HR, gR,o, and CR,g are equivalent
to the same parameters without the subscript ‘R’ in the free running MPRO. These are given by
hR,i=hiαi,0, $H_R=\sum_{i=1}^{N} h_i\alpha_{i,0}$, gR,o=goH/HR, and CR,g=CgH/HR, respectively. The injection current
amplitude is given by (4.9) where αi,1= αiλ1 and λ1 is the first Fourier coefficient of the time-
N
2 n q 1 2 n i 1
h g m ,1Vo cos o t
R ,i
N
N
i 0
2 n q 1 2 n q 1
H R g R ,oVo cos o t H R C R , g oVo sin o t (4.8)
N N
2 n q 1
I o cos o t
N
$$I_o = \frac{1}{2}\,H\,g_{m,d-1}V_o\left|\sum_{i=1}^{N} x_i\,\alpha_{i,1}\exp\left(j2\pi\left(\frac{m_1}{M}-\frac{n}{N}\right)(i-1)\right)\right| \tag{4.9}$$
Equation (4.8) describes the injection-locked MPRO (IL-MPRO) shown in Fig. 4.4. The unit
inverter of this IL-MPRO has a transconductance gm,1, an input capacitance CR,g and an output
conductance gR,o. The mode gain and the normalized free running frequency of the equivalent
MPRO without injection are given by (4.10) and (4.11), respectively where xR,i=hR,i/HR. The
injection current at the ith node of the IL-MPRO is given by (4.12) where Io is expressed by (4.9).
The substitution Ψ=tan-1(ωR,nτ)+θ is made for convenience since it makes the arbitrary phase
shift θ equal to zero when the locking frequency is equal to the free running frequency of the
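$$a_{R,n} = \frac{-1}{\displaystyle\sum_{i=1}^{N} x_{R,i}\cos\left(\frac{2\pi n(i-1)}{N}\right)} \tag{4.10}$$
$$\omega_{R,n}\tau = \frac{-\displaystyle\sum_{i=1}^{N} x_{R,i}\sin\left(\frac{2\pi n(i-1)}{N}\right)}{\displaystyle\sum_{i=1}^{N} x_{R,i}\cos\left(\frac{2\pi n(i-1)}{N}\right)} \tag{4.11}$$
$$i_{inj,i}(t) = I_o\cos\left(\omega_o t - \frac{2\pi n(i-1)}{N} + \tan^{-1}(\omega_{R,n}\tau) + \theta\right) \tag{4.12}$$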
Equations (4.7)-(4.12) represent the mathematical model we use for the SHIL-MPRO
analysis. We use this model in the next section to calculate the locking range of a SHIL-MPRO.
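The equivalent free-running MPRO of a clocked divider can be evaluated numerically. The sketch below assumes the forms a_R,n = -1/Σ x_R,i cos(2πn(i-1)/N) and ω_R,nτ = -Σ x_R,i sin(2πn(i-1)/N) / Σ x_R,i cos(2πn(i-1)/N) for (4.10) and (4.11). It also assumes a clocked branch is weighted by the average λ0 of the modulation function while an unclocked branch keeps its full size. The function name and argument layout are hypothetical.

```python
import math

def equivalent_mpro(num_stages, mode, sizing, clocked, lam0=0.5):
    """Mode gain and normalized free-running frequency of the equivalent MPRO.

    sizing maps branch index i to h_i; clocked maps i to 1 (clocked) or 0.
    Assumption: h_R,i = h_i*lam0 for a clocked branch, h_i otherwise.
    """
    h_r = {i: h * (lam0 if clocked.get(i, 0) else 1.0) for i, h in sizing.items()}
    total = sum(h_r.values())
    c = sum(v / total * math.cos(2 * math.pi * mode * (i - 1) / num_stages)
            for i, v in h_r.items())
    s = sum(v / total * math.sin(2 * math.pi * mode * (i - 1) / num_stages)
            for i, v in h_r.items())
    return -1.0 / c, -s / c

h = 2.0  # sizing ratio of the feed-forward branch, chosen for illustration
both = equivalent_mpro(4, 3, {4: h, 3: 1.0}, {4: 1, 3: 1})
ring_only = equivalent_mpro(4, 3, {4: h, 3: 1.0}, {4: 1, 3: 0})
undesired = equivalent_mpro(4, 2, {4: h, 3: 1.0}, {4: 1, 3: 1})
print(both, ring_only, undesired)
```

For a four-stage oscillator with (h4,h3)=(h,1) these expressions reduce to a mode gain of 1+h and a normalized frequency of h for mode 3 when both branches are clocked, and to 1+λ0h and λ0h when only the feed-forward branch is clocked.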
Fig. 4.5 shows a phasor diagram for an injection-locked MPRO at lock. The current Ig is the
amplitude of the total transconductance current injected into a certain node vq. This current is
given by (4.13) where the negative sign in the first line indicates the current direction out of the
transconductors. The total current injected into the oscillator node IT can be decomposed into a
resistive component IT,R that is in phase with the node voltage and an orthogonal capacitive
component IT,C. The total transconductance current Ig has a phase shift of tan-1(ωR,nτ) compared
to the resistive current. For the oscillator to be locked at a frequency ωo, the total current IT must
have a phase shift tan-1(ωoτ) with respect to the resistive current. The extra phase shift, φ =
tan−1(ωoτ) − tan−1(ωR,nτ), must be provided by the injection current. The angle θ is greater than or
less than zero depending on whether the injection frequency is greater than or less than the free
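$$\begin{aligned} i_{g,q}(t) &= -g_{m,1}V_o\sum_{i=1}^{N} h_{R,i}\cos\left(\omega_o t - \frac{2\pi n(q-1)}{N} - \frac{2\pi n(i-1)}{N}\right) \\ &= \frac{H_R\,g_{m,1}V_o}{a_{R,n}}\sqrt{1+(\omega_{R,n}\tau)^2}\,\cos\left(\omega_o t - \frac{2\pi n(q-1)}{N} + \tan^{-1}(\omega_{R,n}\tau)\right) \\ &= I_g\cos\left(\omega_o t - \frac{2\pi n(q-1)}{N} + \tan^{-1}(\omega_{R,n}\tau)\right) \end{aligned} \tag{4.13}$$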
The gain and phase conditions for locking can be derived from (4.8) by separating the
dc gain of the unit inverter at lock and (4.15) gives the phase condition from which the value of
the angle θ can be calculated. The injection coefficient η is defined in (4.16) where the value of
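$$a_n = \frac{a_{R,n}}{1+\eta\cos\left(\theta+\tan^{-1}(\omega_{R,n}\tau)\right)} \tag{4.14}$$
$$\omega_o\tau = \frac{\omega_{R,n}\tau+\eta\sin\left(\theta+\tan^{-1}(\omega_{R,n}\tau)\right)}{1+\eta\cos\left(\theta+\tan^{-1}(\omega_{R,n}\tau)\right)} \tag{4.15}$$
$$\eta = \frac{I_o}{I_g}\sqrt{1+(\omega_{R,n}\tau)^2} = \frac{a_{R,n}\,I_o}{H_R\,g_{m,1}V_o} = \frac{1}{2}\,a_{R,n}\,\frac{H\,g_{m,d-1}}{H_R\,g_{m,1}}\left|\sum_{i=1}^{N} x_i\,\alpha_{i,1}\exp\left(j2\pi\left(\frac{m_1}{M}-\frac{n}{N}\right)(i-1)\right)\right| \tag{4.16}$$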
In the remainder of this section, we derive expressions for the different possible limitations on
The first limitation on the locking range is due to the maximum phase shift φmax that can be
caused by the injection current. This limitation defines the phase-limited locking range (PLLR)
[22]. Fig. 4.6 shows the phasor diagram of the IL-MPRO at the edge of the PLLR. An expression
of the maximum and minimum locking frequencies due to the PLLR can be derived from the
figure trigonometry or by differentiating equation (4.15) with respect to θ. The PLLR expression
Fig. 4.6. Phasor diagram for an IL-MPRO at the edge of the phase-limited locking range.
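$$(\omega_o\tau)_{PLLR,(max/min)} = \frac{\omega_{R,n}\tau \pm \eta\sqrt{1-\eta^2+(\omega_{R,n}\tau)^2}}{1-\eta^2} \tag{4.17}$$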
Fig. 4.7 is a plot of the PLLR expression in (4.17) as a function of the injection coefficient η
for different values of the normalized free-running frequency (ωR,nτ). As the figure shows, for
small values of η, the locking range is a narrow range around the oscillator free running
frequency. As the injection strength increases, the locking range starts to increase. This behavior
is characteristic for weak injection and is explained in various previous publications as in [22].
However, as Fig. 4.7 shows, as the injection strength increases, the lower end of the PLLR
approaches zero while the upper end increases significantly until it tends to infinity when η
approaches unity. In practice, no matter how strong the injection level is, the locking range can
never tend to infinity which suggests the existence of additional limitations on the locking range
of a divider.
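The phase-limited locking range can be checked numerically by sweeping the angle θ in the phase condition. The sketch below assumes the phase condition has the form ωoτ = (ωR,nτ + η sin(θ + tan⁻¹(ωR,nτ))) / (1 + η cos(θ + tan⁻¹(ωR,nτ))), and compares the swept extremes with the closed form (ωR,nτ ± η√(1 − η² + (ωR,nτ)²)) / (1 − η²) that follows from it for η<1. Both forms and the function names are assumptions of this example.

```python
import math

def locking_frequency(theta, eta, w_r):
    """Normalized locking frequency for a given injection angle theta."""
    phi = theta + math.atan(w_r)
    return (w_r + eta * math.sin(phi)) / (1 + eta * math.cos(phi))

def pllr_closed_form(eta, w_r):
    """(min, max) normalized locking frequency, valid for eta < 1."""
    root = math.sqrt(1 - eta**2 + w_r**2)
    return (w_r - eta * root) / (1 - eta**2), (w_r + eta * root) / (1 - eta**2)

def pllr_by_sweep(eta, w_r, steps=200001):
    """Brute-force extremes of the locking frequency over one full cycle."""
    values = [locking_frequency(-math.pi + 2 * math.pi * k / (steps - 1), eta, w_r)
              for k in range(steps)]
    return min(values), max(values)

eta, w_r = 0.5, 2.0  # illustrative injection coefficient and free-running frequency
low_cf, high_cf = pllr_closed_form(eta, w_r)
low_sw, high_sw = pllr_by_sweep(eta, w_r)
print(low_cf, high_cf)
```

As η approaches unity the denominator 1 − η² vanishes and the upper limit grows without bound, which is the behavior described above.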
Fig. 4.7. Phase-limited locking range, (ωoτ)PLLR,(max/min), versus the injection coefficient η for (ωR,nτ)=1, 2, and 3.
The second limitation on the locking range arises from the gain condition in (4.14). For
locking to occur, the available dc gain of the inverters of the equivalent MPRO must be greater
than the value given by (4.14). Otherwise, the oscillator cannot be locked. This limitation defines
the gain-limited locking range (GLLR). An expression for the GLLR can be derived from (4.14)
and (4.15) and is given by (4.18). The parameter amax is the maximum allowed mode gain for the
IL-MPRO at lock (4.14) while aR,n is the mode gain of the equivalent free running MPRO (4.10).
For our argument in this subsection, amax is the small signal dc gain of the inverters of the
equivalent MPRO and is equal to (HR/H)ao where ao is the small signal dc gain of the actual
$$(\omega_o\tau)_{GLLR/MLLR,max} = \frac{a_{max}}{a_{R,n}}\left[\omega_{R,n}\tau+\sqrt{\eta^2-\left(1-\frac{a_{R,n}}{a_{max}}\right)^2}\right] \tag{4.18}$$
For a certain value of (amax/aR,n), the upper end of the locking range becomes gain-limited
only when the injection coefficient, η, exceeds a certain threshold, ηg, given by (4.19). At this
value, both the PLLR and the GLLR have the same upper bound. When η exceeds this value, the
GLLR becomes the limit for the maximum locking frequency. The lower end of the locking
range is always phase-limited as long as amax≥aR,n which is the practical case of interest.
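$$\eta_g^2 = 1-\frac{a_{R,n}}{a_{max}}+\frac{(\omega_{R,n}\tau)^2}{2}-\omega_{R,n}\tau\sqrt{\frac{(\omega_{R,n}\tau)^2}{4}+\frac{a_{R,n}}{a_{max}}\left(1-\frac{a_{R,n}}{a_{max}}\right)} \tag{4.19}$$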
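The transition from phase-limited to gain-limited locking can be illustrated with a short calculation. The sketch below assumes the upper PLLR limit is (ωR,nτ + η√(1 − η² + (ωR,nτ)²)) / (1 − η²), the upper GLLR limit is (amax/aR,n)[ωR,nτ + √(η² − (1 − aR,n/amax)²)], and the threshold ηg satisfies ηg² = 1 − ρ + (ωR,nτ)²/2 − ωR,nτ√((ωR,nτ)²/4 + ρ(1 − ρ)) with ρ = aR,n/amax. These forms are reconstructions used for illustration, and the function names are hypothetical.

```python
import math

def pllr_max(eta, w_r):
    """Upper phase-limited locking frequency (normalized), for eta < 1."""
    return (w_r + eta * math.sqrt(1 - eta**2 + w_r**2)) / (1 - eta**2)

def gllr_max(eta, w_r, rho):
    """Upper gain-limited locking frequency; rho = a_R,n / a_max."""
    return (w_r + math.sqrt(eta**2 - (1 - rho)**2)) / rho

def eta_threshold(w_r, rho):
    """Injection coefficient at which the PLLR and GLLR upper limits meet."""
    value = 1 - rho + w_r**2 / 2 - w_r * math.sqrt(w_r**2 / 4 + rho * (1 - rho))
    return math.sqrt(max(0.0, value))  # guard against rounding below zero

def max_locking_frequency(eta, w_r, rho):
    """Phase-limited below the threshold, gain-limited above it."""
    if eta <= eta_threshold(w_r, rho):
        return pllr_max(eta, w_r)
    return gllr_max(eta, w_r, rho)

w_r, rho = 2.0, 0.5  # (wR,n*tau)=2 and amax/aR,n=2, as in one curve of Fig. 4.8
eta_g = eta_threshold(w_r, rho)
print(eta_g, max_locking_frequency(0.3, w_r, rho), max_locking_frequency(0.8, w_r, rho))
```

Above the threshold the gain limit grows far more slowly with η than the phase limit, so the available dc gain caps the maximum division frequency.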
Fig. 4.8 shows a plot of the GLLR given by (4.18) for (ωR,nτ)=2 and (amax/aR,n)= 1, 2, and 3.
As the figure shows, for small values of η, the locking range is phase limited. When η exceeds
the value of ηg, the maximum locking frequency becomes gain-limited. The limit increases when
Fig. 4.8. Gain-limited locking range for an IL-MPRO with (ωR,nτ)=2 and (amax/aR,n) = 1, 2, and 3
The third limitation on the locking range arises from the existence of other possible modes of
oscillation for the MPRO. If the mode gain of the IL-MPRO (4.14) is higher than any of the
mode gains of the equivalent free running oscillator (4.10), this oscillation mode is able to build
up and the lock is lost. This limitation defines the mode-limited locking range (MLLR).
An expression for the MLLR is identical to that for GLLR if we define the maximum possible
mode gain amax as the minimum non-dominant mode gain. In general, equation (4.18) expresses
both the GLLR and the MLLR when amax is defined as given by (4.20).
$$a_{max} = \min\left\{\frac{H_R}{H}\,a_o,\ \min_{\substack{l=0,1,\ldots,N-1 \\ l\neq n}}\left(a_{R,l}>0\right)\right\} \tag{4.20}$$
4.2.4. Quadrature-Phase Divider Example
divide-by-2 is used in this example. In the simulations, we use the unit cell shown in Fig. 4.9. It
consists of differential CMOS inverters switched at the common source nodes of both the NMOS
and the PMOS transistors. The clock signal is assumed to be strong enough to turn off the current
path when it is in low state. Accordingly, the time modulation function is assumed to be a 50%
duty cycle square wave with an average λ0=0.5 and a first harmonic component λ1=2/π. When a
certain cell is not clocked, this means that CK=VDD at all times. The dc gain of the unswitched
inverter for VDD=1V in 65nm CMOS technology is equal to 8. The clock signal is a rail-to-rail
square-wave with 10% rise and fall time. The simulated maximum division frequency shown in
this section is for the unloaded divider which represents the absolute maximum frequency a
divider can work at. The effect of loading is discussed in the next section.
Fig. 4.10 shows the schematic of a quadrature phase (N=4) SHIL-MPRO. The sizing factors
of the feed-forward and cross-coupling inverters are (h4,h3)=(h,1). The equivalent MPRO has
two distinct oscillation modes, n=3 is the desired mode and n=2 is an undesired mode. Assuming
the maximum number of distinct input phases is M=4, the input clock phases must satisfy the
Table 4.1 shows all the allowed clocking configurations that satisfy equation (4.7). The table
also shows all the quantities needed to calculate the three locking ranges as a function of the
sizing ratio, h.
h 0 1 a0 1
1 3
2 (1,0) X 0o VDD 180o VDD 1 0h h 0 1 0h h 0 a 0 1
2
h1 1 0 a0 1
3 (1,1) 0 0o 180o 180o 0o 1 h h h2 1 1.67
h1 20 0 a0 1
h1 1 0 a0 1
4 (1,1) 1 0o 90o 180o -90o 1 h h h1 1.67
h1 20 0 a0 1
h1 1 0 a0 1
5 (1,1) 2 0o 0o 180o 180o 1 h h h2 1 1.67
h1 20 0 a0 1
h1 1 0 a0 1
6 (1,1) 3 0o -90o 180o 90o 1 h h h 1 1.67
h1 20 0 a0 1
Fig. 4.11 shows the normalized maximum locking frequency (ωoτ)max for the 6 clocking
configurations, calculated by substituting the values in Table 4.1 into (4.17–4.20), and the
maximum locking input frequency obtained by simulation. For the six configurations, at small
values of h, the free running frequency, (ωR,3τ), is small and thus the MPRO mode gain, aR,n, and
the IL-MPRO mode gain, an, are both smaller than the maximum available dc gain. This causes
the locking range to be phase limited. As the value of h increases, the value of an at the edge of
the PLLR exceeds the available dc gain. Accordingly, the locking range becomes gain limited.
As h increases further, the value of the non-dominant mode gain, aR,2, decreases below the
available dc gain. This decrease in the non-dominant mode gain causes the locking range to be
mode limited. For a 4-phase SHIL-MPRO using this particular unit cell (ao=8), the maximum
locking frequency occurs at the intersection of the GLLR and the MLLR. The value of the sizing
ratio that satisfies this condition, hGM, is calculated for each clocking configuration in the last
Fig. 4.11 shows that the optimum sizing ratio for maximizing the upper limit of the locking
range can be predicted with good accuracy. Note that while the sizing factor of the maximum is
accurate, the plot shows some discrepancy in the magnitude or shape of the curves. The
discrepancy is expected because the normalizing time constant, τ, for an MPRO is a function of
the mode-gain. Accordingly for a SHIL-MPRO, τ is a function of the sizing parameters and
injection strength (4.14). Meanwhile, the theoretical calculation uses a constant τ for
normalization.
When the SHIL-MPRO is sized to maximize the upper end of the locking range, (4.9)
indicates that the equivalent injection current is large. Coupled with the low quality factor of ring
oscillators, the oscillator has a locking range that extends down to low frequencies and is
sufficient for most practical applications. A more detailed application of this model is presented
in Section 4.4.
Fig. 4.11. Theoretical normalized maximum locking frequency and simulated maximum locking frequency
for an unloaded quadrature divide-by-2. (a) to (f) refer to the six clocking configurations in Table 4.1
respectively
To study the effect of loading, we first consider an arbitrary unloaded SHIL-MPRO with a
locking range between fL and fH and with sizing factors {hi} where $H = \sum_{i=1}^{N} h_i$. For a fixed supply
voltage, the power consumption of this unloaded divider at lock is Pd(f). The power is only a
function of the injection frequency, fL ≤ f ≤ fH. The self-loading capacitance of this divider at each
of its outputs is equal to HCg where Cg is the equivalent input capacitance of the unit inverter as
defined before. The exact form of the function Pd(f) depends on the design of the unit inverter
and the divider architecture and is in general due to both the switching and the direct-path
currents.
If the whole unloaded divider is scaled with a factor Kd, the power consumption of the scaled
divider becomes KdPd(f) where fL ≤ f ≤ fH. If the scaled divider outputs are uniformly loaded with
a capacitance CL at each node, the divider behavior as a function of frequency scales in the
frequency axis by a factor (KdHCg+CL)/(KdHCg). Note that the RC ring-type ILO behavior is
always dependent on the normalized variable (ωoτ) rather than the absolute frequency. The
power consumption of the loaded scaled divider can be expressed by (4.21). Fig. 4.12(a) shows
an example for the frequency scaling due to loading. The figure shows the simulated power
consumption of the quadrature divider of Fig. 4.10 as a function of the injection frequency for
topology 2 in Table 4.1 which corresponds to Fig. 4.11(b). The divider uses the unit cell of Fig.
$$P_{div}(f) = K_d\,P_d\!\left(\frac{K_d H C_g + C_L}{K_d H C_g}\,f\right); \quad \text{where } \frac{K_d H C_g}{K_d H C_g + C_L}\,f_L \le f \le \frac{K_d H C_g}{K_d H C_g + C_L}\,f_H \qquad (4.21)$$
To minimize power consumption, a divider must be scaled with the minimum scaling factor
Kd that still achieves the desired maximum operation frequency fmax<fH. This optimal scaling
factor and the corresponding power consumption can be expressed as (4.22) and (4.23)
respectively. In (4.23), Pu(fH) is the power consumption of a unit unloaded divider with a total
sizing factor H=1 at its maximum locking frequency fH and is equal to Pd(fH)/H. In practice, the
frequency fH should be chosen less than the actual maximum locking frequency by a factor large
enough to guarantee the divider achieves the desired frequency over all PVT variations.
[Fig. 4.12 plots: (a) power consumption versus input frequency; (b) Pdiv,opt(fmax)/CL versus fmax [GHz]]
Fig. 4.12. Power consumption of a quadrature divider with sizing factors (h4,h3)=(3,1) and clocking
configuration (α4,α3)=(1,0) (a) As a function of input frequency for different load capacitances (b) As a
function of the maximum desired division frequency when the divider is scaled optimally.
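$$K_{d,min} = \frac{C_L}{H C_g}\,\frac{f_{max}}{f_H - f_{max}}; \quad f_{max} < f_H \qquad (4.22)$$

$$P_{div,opt}(f_{max}) = \frac{f_{max}}{f_H - f_{max}}\,\frac{P_d(f_H)}{H C_g}\,C_L = \frac{f_{max}}{f_H - f_{max}}\,\frac{P_u(f_H)}{C_g}\,C_L \qquad (4.23)$$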
Fig. 4.12(b) is a plot of equation (4.23) where Cg=160fF for our unit cell in Fig. 4.9. If the
divider is always scaled for minimum power, the power consumption is always proportional to
the load capacitance. At frequencies that are small compared to fH, the power is proportional to
the maximum desired input frequency. As the input frequency approaches the maximum
injection frequency of the unloaded divider, fH, the required power increases rapidly with
frequency since the divider has to be scaled up in size to the limit where the external load
The analysis in this section shows that maximizing the maximum locking frequency of the
unloaded divider is useful even when a divider does not need to operate at such a high locking
frequency, since the divider can then be scaled down in size and save power compared to an
unoptimized design.
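As a rough numerical illustration of (4.22) and (4.23), the following Python sketch computes the minimum scaling factor, the power of the optimally scaled divider, and the locking range of the loaded divider given by the bounds of (4.21). The function names and all numeric values are hypothetical and use arbitrary units; they are not taken from the design.

```python
def min_scaling_factor(c_load, f_max, f_h, h_total, c_g):
    """Minimum scaling factor K_d,min from (4.22); requires f_max < f_h."""
    if not 0 < f_max < f_h:
        raise ValueError("f_max must lie strictly between 0 and f_h")
    return (c_load / (h_total * c_g)) * f_max / (f_h - f_max)


def optimal_power(c_load, f_max, f_h, p_unit_at_fh, c_g):
    """Power of the optimally scaled divider from (4.23), using P_u(f_H)."""
    if not 0 < f_max < f_h:
        raise ValueError("f_max must lie strictly between 0 and f_h")
    return f_max / (f_h - f_max) * p_unit_at_fh / c_g * c_load


def loaded_locking_range(k_d, c_load, f_l, f_h, h_total, c_g):
    """Locking range of the scaled, loaded divider (bounds of (4.21))."""
    shrink = k_d * h_total * c_g / (k_d * h_total * c_g + c_load)
    return shrink * f_l, shrink * f_h


# Hypothetical example in arbitrary units (not design values).
k_min = min_scaling_factor(c_load=8.0, f_max=20.0, f_h=60.0, h_total=4.0, c_g=1.0)
p_opt = optimal_power(c_load=8.0, f_max=20.0, f_h=60.0, p_unit_at_fh=2.0, c_g=1.0)
f_lo, f_hi = loaded_locking_range(k_min, 8.0, 6.0, 60.0, 4.0, 1.0)
print(k_min, p_opt, f_hi)
```

With the minimum scaling factor, the upper edge of the loaded locking range lands exactly on the target frequency, which is the condition from which (4.22) follows.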
The mathematical model we derived in equations (4.7) through (4.20) can be used in an
optimization algorithm to find the optimum design. Fig. 4.13 shows a flow chart indicating the
proposed design procedure. In step A, we set the required initial parameters such as the number of
input and output phases, the inverter dc gain, the range of sizing factors to search, and the
form the design space grid. This grid consists of all the possible sizing factors, oscillation modes,
and clocking configurations. We limit the number of signal paths per stage to a maximum of 3
paths. In general, if the maximum frequency required cannot be achieved using 3-paths, then the
maximum number of paths should be increased and the search should be repeated. However, it
should be noted that in practice, as the number of paths increases, the layout design of the divider
becomes more complicated and the benefit obtained from using more paths starts to diminish.
In step C, we exclude the undesired solutions from the grid before the actual computation of
the maximum locking frequency. These solutions include those that contain any two branches i
and i+N/2, since they have opposite phases. We also exclude all the modes that do not lead to
distinct output phases. These are the modes that do not satisfy the condition gcd(n,N)=1 where
gcd() is the greatest common divisor of two integers. We finally exclude all the solutions that
In step D, we use (4.16-4.20) to calculate the maximum locking frequency for all the
remaining solutions in the design space. We sort these solutions in step E into three groups
according to the number of clocked branches. We then sort the solutions in each group in
descending order according to the maximum locking frequency. Finally in step F, we select the
best solution and plot its maximum locking frequency as a function of the sizing factors and
compare that to simulation. Since we do not have to examine all the possible combinations, the
Fig. 4.13. Flow chart indicating the design procedure of a SHIL-MPRO divide-by-2.
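Steps B and C of the procedure can be sketched in Python as follows. This is a minimal illustration, not the implementation used in this work: the function names, the restriction to positive integer sizing factors, and the branch numbering 1..N are assumptions. Only the two exclusion rules stated explicitly in the text are applied, and the clocking configurations and the locking-frequency evaluation of step D are omitted.

```python
from itertools import combinations
from math import gcd


def candidate_branch_sets(n_phases, max_paths=3):
    """Branch-index sets (indices 1..N, N even) with at most max_paths paths,
    excluding any set that contains both i and i + N/2 (opposite phases)."""
    half = n_phases // 2
    sets = []
    for k in range(1, max_paths + 1):
        for branches in combinations(range(1, n_phases + 1), k):
            present = set(branches)
            if any(i + half in present for i in branches):
                continue
            sets.append(branches)
    return sets


def valid_modes(n_phases):
    """Oscillation modes n that give distinct output phases: gcd(n, N) = 1."""
    return [n for n in range(1, n_phases) if gcd(n, n_phases) == 1]


def sizing_grid(n_branches, h_total):
    """All positive-integer sizing factors for n_branches that sum to h_total."""
    if h_total < n_branches:
        return []
    if n_branches == 1:
        return [(h_total,)]
    grid = []
    for first in range(1, h_total - n_branches + 2):
        for rest in sizing_grid(n_branches - 1, h_total - first):
            grid.append((first,) + rest)
    return grid


octal_sets = candidate_branch_sets(8)
print(len(octal_sets), valid_modes(8), len(sizing_grid(3, 10)))
```

For N=8, the branch set (3, 5, 8) survives the pruning while (4, 8) is rejected because the two branches have opposite phases.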
4.5. Octal-Phase Divider
In this section, we apply our procedure to design a quadrature-phase input, octal-phase output
divide-by-2. This divider is the first stage of a divider chain used to generate the clock signals
required by a high speed serializing transmitter which is explained in the next chapter. The initial
parameters for this design are N=8, M=4, in addition to the other parameters given in step A in
Fig. 4.13. The design space formed in step B includes three different kinds of architectures as
shown in Fig. 4.14. Each of these architectures can have different clocking configurations where
we only show the one where one branch is clocked. The first architecture in Fig. 4.14(a) uses two
separate differential-input quadrature-output dividers with a 90° phase shift between their input
clock phases. The advantage of this realization is that the maximum division frequency is equal
to that of a quadrature divider. However, due to the symmetry of the divider, this architecture
suffers from 180° ambiguity in the phase relationship between the outputs of the two dividers. As
part of step C, this ambiguity cannot be tolerated in most applications where the sampling phases
condition can be identified whenever there exist isolated node sets in the divider.
Fig. 4.14(b) shows a second architecture where 4 cross-coupled latches are clocked with the
quadrature input clock to produce 8 phases at half the input frequency. This approach does not
suffer any phase ambiguity but our analysis, in addition to common practice, indicates that the
Fig. 4.14. Quadrature-phase input, octal-phase output divide-by-2 with three possible implementations as (a)
Two separate differential-input quadrature-output dividers (b) Conventional 4 latches in a single loop and
(c) Triple-path SHIL-MPRO.
Fig. 4.14(c) shows a third realization as a triple-path SHIL-MPRO. If the coupling structure
and clocking configuration of this realization are optimized, the design can achieve a maximum
division frequency that is equal to that of a quadrature divider. At the same time, since the design
has no isolated node sets, it does not suffer any phase ambiguity. In fact, one can see that the
third realization is a general case that reduces to the first one if h8=0 and to the second one if
h3=0. In general, a quad-path SHIL-MPRO can also be used and the maximum division
frequency in that case can theoretically be even higher than a quadrature-divider. However, the
We used our model in the search algorithm to examine all the possible triple-path
implementations and find the best coupling structure and clocking configuration (steps D and E).
Fig. 4.15 shows the simulation results for the triple-path octal-phase SHIL-MPRO in Fig. 4.14(c)
where only the branch h3 is clocked, i.e., (α8,α5,α3) = (0,0,1). The figure shows the simulated and
theoretical maximum locking frequencies as a function of the sizing factor, h3, for different
values of h8. The value of h5 is given by h5=H-h8-h3 where H is arbitrarily taken equal to 10. The
figure shows that our model can predict the optimum design with very good accuracy. By
comparing Fig. 4.15 and Fig. 4.11, we see that the maximum locking frequency for the octal-
phase divider is approximately the same as that for the quadrature one. This shows that using
SHIL-MPROs, the maximum locking frequency and the number of output phases can be
For the first stage in our divider chain, we use a triple-path SHIL-MPRO with
consumption while still meeting the maximum frequency target at the worst PVT case.
Fig. 4.15. Theoretical normalized (solid lines) and simulated (circles) maximum locking frequency for an
unloaded triple-path octal-phase divide-by-2 for (α8,α5,α3)=(0,0,1) and h5=10-h8-h3
The second stage in the divider chain used in our transmitter is an octal-phase-input,
are possible. Using the same design procedure, we chose a triple-path SHIL-MPRO with
(h16,h9,h7)=(2,2,6). Fig. 4.16 shows the clocking configuration and coupling structure.
Fig. 4.17. Theoretical normalized (solid lines) and simulated (circles) maximum locking frequency for an
unloaded triple-path hexadecimal-phase divide-by-2 for (α16,α9,α7)=(0,0,1) and h9=10-h16-h7
Fig. 4.17 shows that the simulation results for the selected architecture have good agreement
with the analytical model. The maximum frequency of the 16-phase triple-path SHIL-MPRO is
approximately half that of the 8-phase design. This is acceptable for our case since it does
operate at half the frequency of the octal-phase divider. However, if a higher maximum speed is
needed, 4 or more paths per stage should be used at the cost of a more complicated layout.
4.7. Conclusion
We propose in this chapter a comprehensive analysis and model for superharmonic injection-
locked multipath ring oscillators. The analysis shows that SHIL-MPROs inherit the property of
MPROs such that one can increase the number of output phases without penalizing the maximum
division frequency. We explain three different limitations that determine the maximum locking
range of the frequency divider: phase (PLLR), gain (GLLR), and oscillation mode (MLLR). The
PLLR corresponds to the maximum phase shift that an equivalent injection current can cause.
The GLLR corresponds to the finite dc gain of the inverters forming the divider. And the MLLR
corresponds to the existence of other non-dominant oscillation modes that at some point can
become dominant and prevent the SHIL-MPRO from locking in the desired mode. We derive
expressions for each of these three limitations on locking ranges as functions of the sizing factors
and input clock distribution. These expressions can be used in a design procedure to find the
optimum coupling structure and clocking configuration that maximizes the locking frequency of
the divider. We also explain the effect of loading the divider with an external capacitive load and
show that maximizing the locking frequency can be used to save power if the divider is then
operated at a lower frequency. The design procedure can be used for a wide range of applications
The next chapter describes the detailed implementation of a serial link transmitter that uses the
dividers presented in Sections 4.5 and 4.6 to achieve low area and power for the data serialization.
CHAPTER 5
The ability to achieve frequency division with multiple phases at relatively high frequency is
used to relax the timing requirements of the serializer hence allowing us to eliminate the need for
any delay calibration loops. This chapter applies this idea to a serial-link transmitter. We first
explain the system design of the transmitter in the first section and then give the details of the
Fig. 5.1 shows a block diagram of a serializing transmitter. The data path consists of a 64-to-1
serializer followed by a two-stage driver. The sampling clock signals required for the different
serializer stages are generated from a charge-pump phase-locked loop. An LC-VCO drives the
final N:1 multiplexer with an N-phase clock at a frequency fo/N where fo is the final bit rate. The
VCO clock is then divided by a chain of frequency-dividers to generate the lower frequency
To minimize the chip area of the transmitter, the use of inductors has been restricted only to
the VCO, the final multiplexing stage, and the following pre-driver and driver. All the other
CMOS. In addition, to avoid the overhead added by delay-calibration loops or delay matching
buffers, the serializer must have a wide enough operation range to guarantee the timing
constraints are met for the desired data rate over all PVT corners.
Fig. 5.1. Block-diagram of a 64-to-1 serializing transmitter.
In the following subsections, we explain the different architectural choices that we made to
As explained in Chapter 2, as the order of the final multiplexer (N) increases, the timing
constraints at the final multiplexing stage become more relaxed. However, this has two
drawbacks. First, higher order multiplexers suffer lower bandwidth which means higher power
consumption to meet a certain target data rate. On the other hand, as N increases, the VCO output
frequency decreases which lowers the power consumption requirements of the divider chain as
explained in Chapter 4. Accordingly, to select the value of N that minimizes the total power
consumption, this trade-off has to be analyzed. The second problem is that when N increases, the
output signal becomes more sensitive to phase mismatches in the multiphase VCO. In addition, if
an LC-VCO is used, the number of inductors in the VCO increases, which increases the chip area.
By investigating the power trade-off between the MUX and the clock dividers, we found that
for the target data rate of ~40 Gb/s using 65-nm CMOS technology, a 4-to-1 MUX at the final
stage is the best compromise. If a 2-to-1 MUX is used, the first divider has to be clocked by a
~20-GHz input clock which requires burning a lot of power if inductive peaking is to be avoided.
On the other hand, if an 8-to-1 MUX is used, the power consumption required for the MUX to
meet the bandwidth requirement becomes excessive. In addition, a 4-to-1 multiplexer was found
Another design trade-off is the number of phases in the divided clock signals. Fig. 5.2 shows
the number of latches required to correctly retime the data input to the final 4:1 MUX. If the first
divided clock has a differential phase, the 4 outputs of the preceding 8:4 stage will have
transitions all at the same phase. Accordingly, the data has to be retimed before being applied to
the 4:1 MUX. Fig. 5.2(a) shows a possible implementation where a minimum of 7 latches is
required before the multiplexer. If the divided clock has a quadrature phase, the 4 data inputs will
have 2 transition phases which relaxes the number of latches to a total of 6 latches as shown in
Fig. 5.2(b). Finally, if the divided clock has 8 phases, the input data is already retimed and the
Fig. 5.2 shows that if we can generate 8 phases for the divided clock, power and area savings
can be achieved by minimizing the required sampling elements at the input of the multiplexer.
This approach can be applied to all stages of the serializer. In general, for any multiplexing stage,
if the time resolution of the divided clock is equal to that of the input clock, the number of
retiming elements can be minimized. Time resolution can be maintained if the number of phases
Fig. 5.2. Required retiming elements at the input of the final 4-to-1 multiplexer when the divided clock has
(a) Two phases. (b) Four phases. (c) Eight phases.
Based on the analysis in Chapter 4, multiphase dividers can be designed as superharmonic
Fig. 5.3 shows the proposed transmitter architecture. The transmitter consists of a 64:1
serializer followed by a two-stage driver. The final stage in the serializer is a 4:1 MUX that
includes pulse generation and latching functionality. The required sampling clock signals are
generated from a phase-locked loop. A quadrature phase LC-VCO generates 4 phases at 12 GHz
that directly drive the final MUX and the first frequency divider. High speed, low power
frequency dividers are realized as superharmonic injection-locked multipath ring oscillators. The
dividers generate 8 phases at 6 GHz, 16 phases at 3 GHz, 8 phases at 1.5 GHz, and 16 phases at
750 MHz.
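The timing resolution of each clock in the chain, taken here as the spacing between adjacent phases, can be checked with a few lines of Python. The frequencies and phase counts are the ones listed above; the helper names are hypothetical.

```python
# (name, frequency in GHz, number of phases) for the VCO and the four
# dividers, as listed in the text above.
CLOCKS = [
    ("VCO", 12.0, 4),
    ("divider 1", 6.0, 8),
    ("divider 2", 3.0, 16),
    ("divider 3", 1.5, 8),
    ("divider 4", 0.75, 16),
]


def time_resolution_ps(freq_ghz, phases):
    """Spacing between adjacent clock phases in picoseconds."""
    return 1000.0 / (freq_ghz * phases)


def bit_rate_gbps(vco_ghz, mux_order):
    """Final bit rate when an N:1 MUX is clocked with N phases at f_o / N."""
    return vco_ghz * mux_order


for name, freq, phases in CLOCKS:
    print(f"{name}: {time_resolution_ps(freq, phases):.2f} ps")
print(f"bit rate: {bit_rate_gbps(12.0, 4):.0f} Gb/s")
```

The first two dividers keep the phase spacing of the VCO, while the third divider coarsens it by a factor of 4 and the fourth divider keeps the spacing of the third.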
Since the number of phases at the output of each of the first and the second dividers is double
the number of phases at their inputs, the number of latches at the input of the 8:4 MUX and the
4:1 MUX can be minimized to a single latch per input. This saves a considerable amount of
power. In principle, it is possible to maintain the timing resolution for the whole divider chain.
However, due to the increased layout complexity of multiphase dividers and the lower power of
the low speed stages, the third divider is designed to have only 8 phases instead of 32. This
increases the number of latches required at the input of the 16:8 MUX to 1.5 latches per input
where the MUX itself is used as a latch. The final divider generates 16 phases at 750 MHz
maintaining the timing resolution of the preceding clock and thus allowing the use of a single
latch per stage for the 32:16 MUX. Since the timing constraints are relaxed at this low frequency,
we eliminate this latch. In effect, this means that the 32:16 MUX has an effective φ=0.5 since it
or (2^31−1) that is used as an input data source for the TX [48]. The PRBS generator can also be
bypassed and a 2^10-bit arbitrary sequence can be used as the input data. The data source is driven
by one of the 750 MHz clock phases and its output is retimed and applied to the input of the
serializer.
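For readers unfamiliar with PRBS sources, a sequence of length 2^n−1 can be produced by an n-bit linear feedback shift register. The Python sketch below is a generic software illustration, not the on-chip generator: the feedback polynomials are the commonly used ones and are an assumption here, since the text states only the sequence length.

```python
def prbs_bits(order, taps, count, seed=1):
    """Return `count` bits from a Fibonacci LFSR of length `order`.

    `taps` are the 1-indexed register stages XORed to form the feedback bit.
    """
    mask = (1 << order) - 1
    state = seed & mask
    if state == 0:
        raise ValueError("seed must be non-zero")
    bits = []
    for _ in range(count):
        feedback = 0
        for tap in taps:
            feedback ^= (state >> (tap - 1)) & 1
        state = ((state << 1) | feedback) & mask
        bits.append(feedback)
    return bits


# Assumed (commonly used) tap sets, not confirmed by the text.
PRBS7_TAPS = (7, 6)     # x^7 + x^6 + 1, period 2^7 - 1
PRBS31_TAPS = (31, 28)  # x^31 + x^28 + 1, period 2^31 - 1

short = prbs_bits(7, PRBS7_TAPS, 254)
print(short[:127] == short[127:])
```

The short 7-bit sequence is used for the check because its full period can be enumerated quickly; the 31-bit case uses the same function with the longer tap set.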
To be able to measure the actual waveforms on the chip, we included an on-chip eye
monitoring system. A 6-bit phase interpolator [60] and a simple 8-bit R-2R DAC provide
variable time and voltage references for a high bandwidth comparator. The comparator
subsamples the output of the transmitter at 1/64 of the final data rate. The output of the
comparator is then compared in the control unit to one of the input channels of the transmitter
and bit errors are counted in a 32-bit counter. The measurement is repeated for every input clock
phase and reference voltage and the results are stored in a scan chain that is read in real time
Although the output is subsampled, the sampling aperture of the comparator has to be narrow
enough to avoid filtering the measured signal. In fact, the bandwidth requirements of the
comparator used here are the same as those of a comparator that would be used in the receiver. The design of the
5.2. Circuit Details
This section explains the details of the different circuit blocks used in the transmitter. We
begin with the building blocks of the phase-locked loop then we explain the design of the
Fig. 5.5 shows the schematic of the 12-GHz quadrature LC-VCO. Using the same naming
convention used for MPROs in Chapter 3, the cross-coupled inverters are sized by a scaling
factor h3 while the coupling inverters are sized by h4. An expression for the oscillation frequency
of the quadrature oscillator can be derived using a similar analysis to that of MPROs. In the
Fig. 5.5. Schematic diagram of the quadrature LC-VCO: (a) VCO. (b) Unit capacitor used for coarse tuning.
$$\omega_{osc} = \frac{\omega_R}{2} + \sqrt{\left(\frac{\omega_R}{2}\right)^2 + \omega_o^2} \qquad (5.1)$$
In (5.1), ωo is the resonance frequency of the equivalent tank circuit while ωR is the
oscillation frequency of the core ring oscillator that results if the equivalent inductance of the
tank circuit is eliminated. From the analysis in Chapter 3, as the coupling ratio (h4/h3) is
increased, the value of ωR increases causing the oscillation frequency to deviate more from
resonance. This increase in the coupling ratio improves the quadrature phase accuracy at the cost
In practice, supply noise is a more serious source of jitter compared to the device noise. As
the supply voltage increases, both components that determine the oscillator frequency, ωR and
ωo, change in opposite directions. On one hand, the frequency of the core ring oscillator
increases with the supply voltage due to the increased current drive of the inverters. On the other
hand, the resonance frequency of the tank circuit decreases with supply voltage due to the
increase in the value of the equivalent tank capacitance [39]. The variation in oscillation frequency due
to supply variation can be calculated by differentiating (5.1) as given by (5.2). Since the first
term due to the core ring oscillator is always positive while the second term that represents the
tank resonance is always negative, a value of h4/h3 exists that causes the supply sensitivity to be
nulled.
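The existence of such a null can be illustrated numerically. The Python sketch below assumes that (5.1) has the form ω_osc = ω_R/2 + sqrt((ω_R/2)^2 + ω_o^2), and it uses hypothetical linear supply dependences: ω_R rises and ω_o falls with the supply, with relative slopes k_r and k_o per volt. All values are normalized and illustrative only, and since ω_R grows with the coupling ratio, the nulling value of ω_R stands in for the nulling value of h4/h3.

```python
import math


def omega_osc(w_r, w_o):
    """Oscillation frequency, assuming (5.1) has the form
    w_osc = w_r/2 + sqrt((w_r/2)^2 + w_o^2)."""
    return w_r / 2 + math.sqrt((w_r / 2) ** 2 + w_o ** 2)


def supply_sensitivity(w_r, w_o, k_r, k_o, dv=1e-6):
    """d(w_osc)/dVdd by central difference for a small supply step dv.

    Hypothetical model: w_r rises by a fraction k_r per volt and
    w_o falls by a fraction k_o per volt.
    """
    hi = omega_osc(w_r * (1 + k_r * dv), w_o * (1 - k_o * dv))
    lo = omega_osc(w_r * (1 - k_r * dv), w_o * (1 + k_o * dv))
    return (hi - lo) / (2 * dv)


def nulling_ring_frequency(w_o, k_r, k_o, w_r_max, iters=60):
    """Bisection for the w_r at which the supply sensitivity crosses zero."""
    lo, hi = 0.0, w_r_max
    if supply_sensitivity(hi, w_o, k_r, k_o) <= 0:
        raise ValueError("no sign change below w_r_max")
    for _ in range(iters):
        mid = (lo + hi) / 2
        if supply_sensitivity(mid, w_o, k_r, k_o) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


# Normalized, purely illustrative values.
w_null = nulling_ring_frequency(w_o=1.0, k_r=1.0, k_o=0.2, w_r_max=1.0)
print(w_null, supply_sensitivity(w_null, 1.0, 1.0, 0.2))
```

With no coupling the sensitivity is set by the tank alone and is negative, and it becomes positive once the ring term dominates, so a zero crossing lies in between.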
Fig. 5.6 shows simulation results for the supply sensitivity defined by (5.3) and the oscillator
figure of merit (2.2). In addition to improving the phase accuracy, increasing the coupling ratio
can improve the supply rejection by more than 20dB at the cost of ~4dB degradation in random
phase noise for the same power consumption. In a typical mixed-signal chip environment, this
leads to a net improvement in the overall jitter. Accordingly, in our design, we use a
$$S_{VDD} = 20\log\left(\frac{\Delta\omega_{osc}/\omega_{osc}}{\Delta V_{dd}/V_{dd}}\right) \qquad (5.3)$$
Fig. 5.6. Effect of varying the coupling ratio (h4/h3) on the supply voltage sensitivity and the oscillator
figure of merit
The divider-chain consists of four stages. The first and second stages are explained in
Sections 4.5 and 4.6, respectively. They are realized as octal-phase and hexadecimal-phase
superharmonic injection locked triple-path ring oscillators. The first stage has a coupling
stage has (h16,h9,h7)=(2,2,6) and (α16,α9,α7)=(0,0,1). These sizing ratios maximize the maximum
locking frequency as explained in the previous chapter with only one clocked branch to simplify
For the octal-phase divider in the third stage, we only used a conventional dual-path design like
the one in Fig. 4.14(b). The reason for this choice is that at the third stage, the frequency is 4
times lower than that at the first stage and the loading capacitance is also less due to the
fanout in the transmitter data path. Accordingly, the optimally scaled triple-path design would
reach the limit of the minimum device size that we set and the dual-path design becomes more
power efficient due to reduced self-loading. Similarly, for the final hexadecimal-phase divider,
we used a dual-path design using 8 differential cross-coupled latches. The sizing ratios for the
third and fourth stages are (h8,h5)=(3,1) and (h16,h9)=(3,1), respectively. In both stages, only the
feedforward inverters are clocked. That is (α8,α5)=(1,0) for the octal-phase divider and
Fig. 5.7(a) shows the block diagram of a conventional tri-state phase/frequency detector
(PFD) [51]. The flip-flop can be implemented as a TSPC-FF with an asynchronous reset signal
as shown in Fig. 5.7(b). Since the D-input of the flip-flop is always equal to ‘1’, node ‘X’ in Fig.
5.7(b) is always pulled down. Accordingly, both the first stage and the pull-down network of the
second stage can be removed without affecting the functionality of the flip-flop when used in a
PFD. Since the output has 0-to-1 transitions only when the reset input is activated, M6 in the
third stage can also be removed. The flip-flop thus reduces to the simplified form in Fig. 5.7(c).
Fig. 5.7(d) shows the schematic of the proposed PFD. It consists of two simplified dynamic
latches that require only a single phase for both the reference and divided clock inputs.
Minimizing the number of stacked devices allows the PFD to have sharp output transitions.
Simulations show fast rising/falling edges for frequencies up to more than 2 GHz.
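The tri-state behavior that the PFD implements can be captured by a short behavioral model. The Python sketch below is an idealized illustration with zero reset delay, not a model of the transistor-level circuit; the function name and the edge times are hypothetical.

```python
def pfd_pulse_widths(ref_edges, div_edges):
    """Total UP and DN pulse widths of an ideal tri-state PFD.

    A reference edge sets UP, a divided-clock edge sets DN, and both
    outputs are reset as soon as both are high (zero reset delay).
    """
    events = sorted([(t, "ref") for t in ref_edges] +
                    [(t, "div") for t in div_edges])
    up = dn = False
    up_start = dn_start = 0
    up_total = dn_total = 0
    for t, source in events:
        if source == "ref" and not up:
            up, up_start = True, t
        elif source == "div" and not dn:
            dn, dn_start = True, t
        if up and dn:
            up_total += t - up_start
            dn_total += t - dn_start
            up = dn = False
    return up_total, dn_total


# Reference leads the divided clock by 2 time units on every cycle.
print(pfd_pulse_widths([0, 10, 20], [2, 12, 22]))
```

When the reference leads, only the UP output carries net pulse width, and the roles swap when the divided clock leads.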
Fig. 5.7. Evolution of the proposed phase/frequency detector (a) Block diagram of a conventional PFD
(b) TSPC flip-flop with an asynchronous reset (c) Simplified TSPC-FF assuming D=1 (d) Proposed PFD
5.2.4. Charge-Pump
Fig. 5.8 shows the schematic diagram of the charge pump. Dual current compensation [52] is
used to achieve constant and matched UP/DN currents. In addition, current steering and OpAmp
biasing of node X reduce the dynamic mismatch [53]. Since the loop filter capacitors are
implemented using metal fringing capacitors, no leakage compensation is needed in the charge
pump. Due to the small static and dynamic mismatch, the measured spur level is less than −74
The final stage in the serializer is a 4-to-1 multiplexer that operates up to 48 Gb/s. The MUX
is clocked with a quadrature-phase 12 GHz clock generated directly from an LC-VCO. Fig.
5.9(a) shows a block diagram of the multiplexer. Each unit cell generates a 25% duty-cycle pulse
by ANDing two adjacent clock phases. This pulse is used to connect one data input to the output
as shown in Fig. 5.9(b). The outputs of the 4 unit cells are added in the current-domain and
converted into a voltage by the output impedance of the multiplexer. Inductive shunt peaking is
5.10(a), a stack of 3 NMOS devices is used to perform the ANDing and the sampling operations
in a single stage. This technique is used in [54] to implement a 10 Gb/s 4:1 MUX in 0.13μm
CMOS technology. Although the number of internal nodes is minimal, the devices have to be
large enough to provide the current drive needed to produce the desired swing. Scaling up the
devices increases the capacitive loading of the VCO and the preceding latch which increases the
Fig. 5.9. Block diagram (a) and timing diagram (b) of a 4:1 multiplexer.
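The selection mechanism of the multiplexer can be illustrated with a slot-level behavioral model in Python. The sketch divides each clock period into four quarter-period slots and treats the clocks as ideal 50% duty-cycle square waves; the function names and the phase-numbering convention are assumptions, and analog effects such as finite bandwidth are ignored.

```python
def clock_high(phase_index, slot):
    """Quadrature 50% duty-cycle clock: phase k is high for two of the
    four quarter-period slots, starting at slot k."""
    return (slot - phase_index) % 4 in (0, 1)


def select_pulse(cell, slot):
    """25% duty-cycle pulse obtained by ANDing two adjacent clock phases."""
    return clock_high(cell, slot) and clock_high((cell + 1) % 4, slot)


def mux_4to1(words):
    """Serialize 4-bit words: in each slot, sum the outputs of the enabled
    unit cells, mimicking the current-domain summation at the output."""
    out = []
    for word in words:
        for slot in range(4):
            out.append(sum(bit for cell, bit in enumerate(word)
                           if select_pulse(cell, slot)))
    return out


print(mux_4to1([(1, 0, 0, 1)]))
```

Exactly one unit cell is enabled in every slot, so the summed output carries one input bit per quarter period, which gives four bits per clock cycle.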
Fig. 5.10. Different possible implementations of the 4:1 MUX unit cell
To improve the current drive, a two-stage implementation can be used where pulse generation
is achieved using a separate AND gate as shown in Fig. 5.10(b). Since the AND gate is directly
driven by the sinusoidal VCO outputs, the output of the first NAND gate is not rail-to-rail and
the inverter burns an excessive amount of power due to both the switching and the direct-path
currents.
A modified two-stage implementation is shown in Fig. 5.10(c) [55]. Since the AND gate
output duty-cycle is increased to 50% and its switching activity is reduced by a factor of 2, the
power consumption can be reduced. In addition, the first stage now acts as a dynamic latch
where node X is pre-charged every clock cycle and then conditionally discharged by the input
The proposed implementation of the multiplexer unit cell is shown in Fig. 5.11(a). It is a
modified version of the design in [55] where the input latch is realized in PMOS. At the rising
edge of CK1, L1 holds the sampled data input at the internal capacitance of node X which then
propagates to the output as shown in Fig. 5.11(b). At the rising edge of CK2, node X is
discharged causing the output current to turn off. Since node X is pre-discharged rather than pre-
charged, the inverter is not needed and can be removed. The proposed design minimizes the
Fig. 5.11(c) shows the schematic diagram of the proposed 4-to-1 multiplexer. It uses a
differential version of the proposed unit cell. The transistor M1 is shared between the 4 branches
to provide a higher current drive for the discharging branch. Since each clock phase drives both
clock inputs in two different unit cells and each cell has a differential data input, the clock load is
symmetric for all phases and no phase mismatch can result from load imbalance.
Fig. 5.11. Proposed 4-to-1 multiplexer (a) Unit cell (b) Timing diagram (c) Complete MUX
5.2.6. Comparator Design
The comparator is used in the on-chip eye-monitoring system to sample the output waveform
of the transmitter. Accordingly, the sampling aperture of the comparator has to be narrow enough
such that it does not introduce significant filtering for the measured waveform.
Fig. 5.12(a) shows the schematic of the proposed comparator design. The comparator uses
both edges of the input differential clock to operate at double the input clock rate. During one of
the two input clock phases, one channel uses negative feedback to extend the tracking bandwidth
at the cost of the tracking gain. At the same time, the other channel uses positive feedback to
achieve regeneration. The switching between tracking and regeneration is achieved by reversing
the output polarity of the feedback stage as shown in Fig. 5.12(b). The design uses drain
switching rather than source switching to minimize the transition time from tracking to
Fig. 5.13 shows the simulated sampling function of the comparator. The simulation is
performed using the periodic steady-state (PSS) and the periodic ac (PAC) analyses in spectreRF
as explained in [61]. The sampling instant is taken at the end of the regeneration period just
before the clock transition. The simulation is done using a 10 GHz input clock which
corresponds to a sampling rate of 20 GS/s. Fig. 5.13 shows that the sampling aperture is
Fig. 5.12. Proposed drain-switched comparator: (a) Schematic diagram. (b) Equivalent block diagram.
Fig. 5.14 shows the simulated sensitivity of the comparator. A ‘00110011’ sequence is
applied to the comparator input at double the clock rate. This causes a ‘0101’ output at each of
the comparator outputs. The time shift between the data transition and the clock edge is varied
and each time, the minimum signal amplitude needed to obtain correct outputs is found. The
minimum sensitivity, which is less than 3 mV, occurs for a data-to-clock delay of ~30 ps, which is equal to the delay of the
comparator as can be seen from Fig. 5.13.
5.3. Conclusion
Serial link transmitters are systems that are inherently parallel and thus ones that can greatly
benefit from multiphase clocking. In this chapter we showed how multiphase sampling can be
used to optimize the operation bit-rate window such that the transmitter can operate for a wide
range of bit rates. This leads to designing transmitters that are robust to PVT variations and
eliminate the need for any delay matching buffers or delay calibration loops leading to a simpler,
Multiphase sampling can also allow the designer to minimize the number of sampling
elements between the different serializer stages which are needed for resynchronization. If the
timing resolution of the sampling clock is maintained between different stages, the required
number of latches can be reduced by 43% which leads to significant improvement in power and
chip area provided that phase generation can be achieved at no extra overhead. This is
Various circuit techniques are proposed to improve the power efficiency and performance of
the transmitter. An accurate quadrature-phase low supply sensitivity VCO design has been
explained to minimize the deterministic jitter at the output of the transmitter. A novel PFD and
an accurate charge pump reduce the reference spur level to less than -74 dBc which leads to
negligible sinusoidal jitter at the output. A new 4-to-1 multiplexer design that achieves the
desired bandwidth requirements at low power is proposed. Finally, a low-sampling-aperture
comparator that uses drain switching and serves an on-chip eye monitoring system is explained.
In the next chapter, we show the measurement results of the fabricated transmitter chip.
CHAPTER 6
Experimental Results
The transmitter explained in the previous chapter is fabricated in 65-nm 1P9M CMOS
technology and tested in a chip-on-board assembly. Fig. 6.1 shows the chip and the printed
circuit board photographs. The chip occupies a total area of 1.2 mm². In this chapter, we explain
the details of the testing setup and discuss the measurement results.
Fig. 6.1. (a) Chip and (b) printed circuit board photographs.
Fig. 6.2 shows a block diagram of the testing setup. The transmitter output is probed using a
50-GHz bandwidth GSSG cascade probe. The probe is either connected to a 20-GHz bandwidth
generates a 750-MHz reference clock that is used both as a reference to the PLL and to trigger
the scope. Different control bits are written into an on-chip scan chain using a pattern generator
and can be read out using a logic analyzer. The interface between the pattern generator, the logic
analyzer, the chip and the PC is controlled by a National Instruments data acquisition (NI-DAQ)
interface.
Fig. 6.3 shows the measured PLL output spectrum at 10.56-GHz output. To test the PLL, a
‘0011’ sequence is applied to the serializer causing the final 4:1 MUX, the pre-driver, and the
driver to act as a 3-stage clock buffer for the VCO output. The locking range of the PLL extends
from 7.9 GHz to 12.1 GHz limited only by the VCO tuning range. The worst reference spur level
over the locking range of the PLL is −74 dBc for a carrier frequency of 10.56 GHz.
Fig. 6.3. Measured spectrum of the PLL output for a 10.56 GHz output carrier frequency (vertical axis: spectrum in dBm).
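To put the measured reference spur level in perspective, the following minimal sketch converts a spur level into peak-to-peak sinusoidal jitter. It assumes narrowband phase modulation, where a pair of sidebands at a given level in dBc corresponds to a peak phase deviation of 2·10^(dBc/20) rad. The function name `spur_to_jitter_pp` is hypothetical and is not taken from this work.

```python
import math


def spur_to_jitter_pp(spur_dbc: float, carrier_hz: float) -> float:
    """Peak-to-peak sinusoidal jitter in seconds caused by a spur pair.

    Assumption: narrowband phase modulation, so a sideband level of
    spur_dbc implies a peak phase deviation of 2 * 10**(spur_dbc / 20) rad.
    """
    peak_phase_rad = 2.0 * 10.0 ** (spur_dbc / 20.0)
    # The peak-to-peak phase swing is twice the peak deviation;
    # dividing by the angular carrier frequency converts radians to seconds.
    return 2.0 * peak_phase_rad / (2.0 * math.pi * carrier_hz)


# Worst-case spur reported for the 10.56 GHz carrier.
jitter_pp = spur_to_jitter_pp(-74.0, 10.56e9)
print(f"{jitter_pp * 1e15:.1f} fs peak-to-peak")
```

Under these assumptions, a −74 dBc spur at 10.56 GHz corresponds to roughly 12 fs of peak-to-peak sinusoidal jitter, which is consistent with the spur contribution being negligible.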
Fig. 6.4 shows the measured single side-band phase noise for a carrier frequency of 12.08
GHz near the upper end of the locking range. The out of band phase noise at 10 MHz offset is
−127.5 dBc/Hz. The small bump around 1 MHz is due to some under-damping in the loop
dynamics. The integrated phase noise from 100 Hz to 10 MHz is equal to 1.09 degrees rms, which
corresponds to 251 fs rms random jitter. As one would expect from the very low spur level, the
Fig. 6.4. Measured single side-band phase noise for a 12.08 GHz carrier frequency.
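The conversion from integrated phase noise to random jitter quoted for the 12.08 GHz carrier can be reproduced with a short calculation. The sketch below simply scales the rms phase error, expressed as a fraction of a full cycle, by the carrier period. The function name `phase_to_jitter_rms` is hypothetical.

```python
def phase_to_jitter_rms(phase_deg_rms: float, carrier_hz: float) -> float:
    """Convert integrated phase noise (degrees rms) to jitter (seconds rms)."""
    # Fraction of one carrier cycle, multiplied by the carrier period.
    return (phase_deg_rms / 360.0) / carrier_hz


jitter_rms = phase_to_jitter_rms(1.09, 12.08e9)
print(f"{jitter_rms * 1e15:.0f} fs rms")
```

For 1.09 degrees rms at 12.08 GHz this gives approximately 251 fs rms.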
Such low jitter numbers are not measurable using our equipment. Fig. 6.5 shows the time
domain measured jitter histogram. The measured jitter is less than 0.66 ps rms limited by the
The serializer operates properly over the whole locking range of the PLL. This range
corresponds to data rates of 32-48 Gb/s. This wide operation range is due to the optimum choice
of the sampling phases at all the multiplexing stages. Fig. 6.6 shows the measured single-ended
eye-diagram at the output of the transmitter at 48 Gb/s while Fig. 6.7 shows the measured eye
using the on-chip eye monitoring. The vertical eye opening measured on chip for the single-
ended output is ~270 mV while that measured by the scope is ~150 mV. This difference is due to
the attenuation caused by the probes, the 48-inch cables and the sampling front-end of the digital
Fig. 6.7. Eye diagram measured using on-chip eye monitor (vertical axis: voltage in volts; horizontal axis: time in UI).
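The data rates quoted in this chapter are consistent with four bits being transmitted per VCO period, as suggested by the final 4:1 multiplexer and by the '0011' pattern reproducing the VCO clock at the output. The sketch below makes this relationship explicit. The factor of four is inferred from these observations rather than taken from a stated formula, and the names `BITS_PER_VCO_CYCLE` and `data_rate_gbps` are hypothetical.

```python
# Assumption: the final 4:1 MUX serializes four bits per VCO period.
BITS_PER_VCO_CYCLE = 4


def data_rate_gbps(vco_ghz: float) -> float:
    """Serial data rate in Gb/s for a given VCO frequency in GHz."""
    return BITS_PER_VCO_CYCLE * vco_ghz


# Upper end of the PLL locking range and the carrier used for the spectrum.
for f_vco in (10.56, 12.1):
    print(f"{f_vco} GHz VCO -> {data_rate_gbps(f_vco):.2f} Gb/s")
```

With this assumption, the 12.1 GHz upper edge of the locking range maps to 48.4 Gb/s, and a 31.68 Gb/s lower data rate corresponds to a 7.92 GHz VCO frequency.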
The chip operates at data rates of 31.68–48.4 Gb/s. It consumes a total power of 88 mW from a
1.2 V supply. Table 6.1 shows the power breakdown of the transmitter while Table 6.2 compares
the achieved performance with other serializing transmitters operating at similar data rates. The
use of high-speed power-optimized multiphase frequency dividers allows us to remove all forms
of delay adjustment overhead that is used in other designs. In addition, it allows us to minimize
the number of retiming latches in the data path and to realize the whole divider chain in CMOS
without using any inductors. All of this leads to significant power and area savings
compared to previous work. These savings are achieved with a phase noise and jitter
Table 6.1. Power breakdown for the transmitter chip.
Block              Power (mW)   Percentage (%)
Buffer/PFD/CP      2            2.3
Predriver/Driver   26.4         30
Serializer         15           17
Total              88           100
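The percentages in Table 6.1 and the resulting energy efficiency can be checked with a few lines of arithmetic. The sketch below uses only the table entries listed above and assumes, for illustration, that the 88 mW total applies at the 48 Gb/s data rate of the measured eye diagram. The variable names are hypothetical.

```python
total_mw = 88.0

# Entries of Table 6.1 that are listed above.
blocks_mw = {
    "Buffer/PFD/CP": 2.0,
    "Predriver/Driver": 26.4,
    "Serializer": 15.0,
}

# Share of the total power for each listed block, in percent.
shares = {name: 100.0 * p / total_mw for name, p in blocks_mw.items()}
for name, pct in shares.items():
    print(f"{name:18s} {pct:5.1f} %")

# Assumption: the total power is consumed at 48 Gb/s.
data_rate_bps = 48e9
energy_pj_per_bit = (total_mw * 1e-3) / data_rate_bps * 1e12
print(f"{energy_pj_per_bit:.2f} pJ/bit")
```

Under this assumption the transmitter consumes about 1.83 pJ/bit.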
CHAPTER 7
Conclusion
In this dissertation, we present an accurate model and a comprehensive analysis that explain
the operation of multipath ring oscillators [57] and frequency dividers [58]. We showed that
using multipath coupling, the number of phases of a ring oscillator or a frequency divider can be
increasing the number of phases, it is possible to push the maximum operation frequency to
higher values at the cost of amplitude reduction or equivalently degraded phase noise. We show
that the only limitation on the maximum operation frequency of a ring oscillator or a frequency
divider is the cut-off frequency of the technology and not the number of phases needed at its
output.
The mode stability problem which is characteristic of multipath ring structures is part of the
analytical study. To help designers traverse the large design space, a design procedure based on a
linear optimization algorithm is provided to allow a designer to quickly arrive at the optimum
coupling structure that maximizes the oscillation frequency given a certain desired number of
phases and phase noise performance. This optimum structure is chosen while considering mode
stability.
The oscillator analysis is further extended to apply to frequency dividers. The locking range
strong signal injection. We show that, unlike the previously studied weak injection scenario, the
actual locking range of the divider is not phase limited. Instead, it is limited either by the
maximum gain available in the circuit or by the existence of other non-dominant modes that can
Similar to our work on oscillator design, a design procedure that uses the derived
mathematical model is proposed to help a designer to quickly arrive at the best divider design
The ability to generate multiple phases at high frequencies and with low power is useful in
numerous applications. As an example, we showed how this can help design a high speed serial
link transmitter allowing it to operate over a wide range of data rates without the need for any delay
adjustment overhead. As shown in our measurement results, the architecture also minimizes the
number of high-speed latches which results in significant power savings in comparison with
The analyses presented in this work can be extended to many of the applications mentioned in
Chapter 1. One example is to design a complete clock network as one big coupled oscillator. The
effect of different loads and routing parasitics can be taken into account and compensated for by
proper adjustment of coupling strengths. Since the figure of merit of an optimally designed
multipath ring oscillator can be maintained regardless of the number of phases it has, this
suggests that as the clock network grows in size, the resulting phase noise or jitter at any of its
outputs can actually decrease. This conclusion is opposite to common wisdom which suggests
that as more buffering elements are added, the jitter performance should degrade.
Another extension is to perform this coupling in clock networks and embed the frequency
dividers. This extension is not unlike the clock divider approach described in this dissertation
except the feedback divider can be considered as part of the oscillator itself as opposed to an
explicit element downstream from the oscillator. In addition to reducing jitter, inter-stage
coupling allows the design of a clock network with accurate and well defined phase relationships
Appendix
Fig. A.1 shows the general form of a multipath LC oscillator (MPLCO). Similar to the
MPRO, we assume all coupling paths exist. Each row of inverters is sized by a sizing factor, hi,
with respect to a certain reference inverter where i=1,2,3,…,N. The unit inverter has an
equivalent input capacitance, Cg, an output conductance, go, and an equivalent transconductance,
gm. Unlike the MPRO, each of the output nodes of the MPLCO is loaded with an LC tank circuit
having a capacitance, Ct, and an inductance, L. The finite quality factor of the inductance is modeled by a parallel conductance, gt.
The output voltage vi can be expressed by (A.1) where Vo is the oscillation amplitude, ωn is
the oscillation frequency, and n=0,1,2,…,N-1 is the oscillation mode index. The gain and phase
oscillation conditions can be derived by applying KCL at any of the oscillator nodes similar to
the MPRO analysis. Equation (A.2) shows the harmonic balance equation at node v1.
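\[
v_i(t) = V_o \cos\!\left(\omega_n t + \frac{2\pi n (i-1)}{N}\right) \tag{A.1}
\]

\[
\begin{aligned}
&\sum_{i=1}^{N} h_i g_m V_o \cos\!\left(\omega_n t + \frac{2\pi n (i-1)}{N}\right)
+ g_o\left(\frac{g_t}{g_o} + \sum_{i=1}^{N} h_i\right) V_o \cos(\omega_n t) \\
&\quad - \omega_n C_g\left(\frac{C_t}{C_g} + \sum_{i=1}^{N} h_i\right) V_o \sin(\omega_n t)
+ \frac{1}{\omega_n L} V_o \sin(\omega_n t) = 0
\end{aligned} \tag{A.2}
\]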
By separating the coefficient of cos(ωnt), the gain condition can be derived and the mode gain
can be expressed by (A.3) where aR,n is the mode gain of the core MPRO that results if the LC
tanks are disconnected (3.8) and H = Σ hi, summed over i = 1, …, N, is defined similar to the previous MPRO and SHIL-MPRO
analyses. Equation (A.3) shows that the dc gain required from the inverters in the case of
an MPLCO is slightly higher than that for an MPRO due to the loading caused by the finite
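\[
a_n = \frac{-\left(1 + \dfrac{g_t}{H g_o}\right)}{\displaystyle\sum_{i=1}^{N} x_i \cos\!\left(\frac{2\pi n (i-1)}{N}\right)}
= \left(1 + \frac{g_t}{H g_o}\right) a_{R,n} \tag{A.3}
\]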
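As a numerical illustration of the gain condition (A.3), the sketch below evaluates the required mode gain for a given set of normalized coupling strengths. It assumes that xi = hi/H, so that the xi sum to one, and that the mode gain carries a leading minus sign so that a positive gain results when the weighted cosine sum is negative. The function name `mode_gain` and the example coupling vector are hypothetical.

```python
import math


def mode_gain(x, n, gt_over_Hgo=0.0):
    """Required dc gain of mode n according to (A.3).

    x            -- normalized coupling strengths x_1..x_N (assumed x_i = h_i / H)
    n            -- oscillation mode index, 0..N-1
    gt_over_Hgo  -- tank loading term g_t / (H * g_o); 0 recovers the core MPRO
    """
    N = len(x)
    # enumerate() starts at 0, which corresponds to (i - 1) in (A.3).
    weighted_sum = sum(xi * math.cos(2.0 * math.pi * n * k / N)
                       for k, xi in enumerate(x))
    return -(1.0 + gt_over_Hgo) / weighted_sum


# Assumed example: a 3-stage ring with a single coupling path.
x_single_path = [0.0, 0.0, 1.0]
print(mode_gain(x_single_path, n=1))        # core MPRO, no tank loading
print(mode_gain(x_single_path, n=1, gt_over_Hgo=0.1))
```

For this single-path example the core gain evaluates to approximately 2, the familiar minimum stage gain of a three-stage ring oscillator, and the tank loading scales it by the factor 1 + gt/(Hgo).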
The phase condition can be derived by separating the coefficient of sin(ωnt) in (A.2). Due to
the presence of the inductor, the phase condition is a quadratic equation of the oscillation
frequency that has two solutions rather than just one as for an MPRO. The oscillation frequencies
of the nth mode can be expressed by (A.4) where ωR,n is the oscillation frequency of the core
MPRO if the tank circuits are disconnected, expressed by (3.7), and ωo = [L(Ct+HCg)]^(−1/2) is the
resonance frequency of the tank circuit loaded with the inverters. Equation (A.4) shows that if
the core MPRO does not have a zero frequency, the oscillation frequency of the MPLCO can
take one of two values that is either higher than or less than the resonance frequency. Thus each
value of the mode index, n, corresponds to two oscillation modes. The first mode has a higher
oscillation frequency than the resonance frequency where the tank load appears capacitive and
thus we refer to it as the capacitive mode and we denote it with the subscript ‘C’. The second
mode has a lower frequency than resonance where the tank circuit appears inductive and thus we
refer to it as the inductive mode and we denote it with the subscript ‘L’.
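\[
\omega_{n,C/L} = \pm\,\frac{1 + \dfrac{g_t}{H g_o}}{2\left(1 + \dfrac{C_t}{H C_g}\right)}\,\omega_{R,n}
+ \sqrt{\left[\frac{1 + \dfrac{g_t}{H g_o}}{2\left(1 + \dfrac{C_t}{H C_g}\right)}\,\omega_{R,n}\right]^{2} + \omega_o^{2}} \tag{A.4}
\]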
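The two solutions of the phase condition can be explored numerically. The sketch below assumes that the phase condition reduces to the quadratic ωn² ∓ 2bωn − ωo² = 0, where b is the scaled core frequency appearing in (A.4), and that ωR,n is positive so that the upper sign gives the capacitive mode. The function name `mplco_frequencies` and all component ratios in the example are hypothetical.

```python
import math


def mplco_frequencies(w_rn, w_o, gt_over_Hgo, ct_over_HCg):
    """Return (capacitive, inductive) angular frequencies of mode n.

    w_rn         -- core MPRO frequency of the mode (assumed > 0), rad/s
    w_o          -- loaded tank resonance frequency, rad/s
    gt_over_Hgo  -- g_t / (H * g_o)
    ct_over_HCg  -- C_t / (H * C_g)
    """
    b = 0.5 * (1.0 + gt_over_Hgo) / (1.0 + ct_over_HCg) * w_rn
    root = math.hypot(b, w_o)  # sqrt(b**2 + w_o**2)
    return root + b, root - b


# Assumed example values, chosen only for illustration.
w_o = 2.0 * math.pi * 10e9
w_c, w_l = mplco_frequencies(2.0 * math.pi * 5e9, w_o, 0.1, 1.0)
print(f"capacitive mode: {w_c / (2 * math.pi) / 1e9:.2f} GHz")
print(f"inductive mode:  {w_l / (2 * math.pi) / 1e9:.2f} GHz")
```

In this form the two frequencies always straddle the resonance frequency, and their product equals ωo², so a stronger core MPRO contribution pushes the two modes further apart.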
The analysis in its current form does not show whether the capacitive or the inductive mode is
dominant. This is due to the symmetry of the impedance of the tank load as a function of
frequency. In practice, the finite quality factor of the inductor is more precisely modeled as a
series resistance rather than a parallel conductance [56]. In this case, the equivalent parallel
conductance of the tank circuit is lower at higher frequencies, causing the mode gain of the
capacitive mode to be lower than that of the inductive one, so the capacitive mode becomes the dominant one.
If the quality factor of the tank circuit is limited by that of the capacitance rather than the
inductance, the situation is reversed where the inductive mode becomes the dominant mode. This
situation, however, is rare in practice and it can easily be avoided by good layout design.
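The frequency dependence of the tank loss described above can be illustrated with the standard series-to-parallel transformation of a lossy inductor. The sketch below computes the equivalent parallel conductance of an inductor L with series resistance Rs, which is Rs/(Rs² + (ωL)²). The function name `parallel_conductance` and the component values are hypothetical and are not taken from the fabricated design.

```python
import math


def parallel_conductance(r_series, inductance, freq_hz):
    """Equivalent parallel conductance (S) of a series R-L branch."""
    w = 2.0 * math.pi * freq_hz
    # Real part of the admittance 1 / (Rs + j*w*L).
    return r_series / (r_series ** 2 + (w * inductance) ** 2)


# Assumed example: 0.5 nH inductor with 2 ohm series resistance.
g_low = parallel_conductance(2.0, 0.5e-9, 8e9)
g_high = parallel_conductance(2.0, 0.5e-9, 12e9)
print(f"8 GHz:  {g_low * 1e3:.2f} mS")
print(f"12 GHz: {g_high * 1e3:.2f} mS")
```

The conductance falls as the frequency rises, so the higher-frequency capacitive mode sees less tank loading and requires less gain, in line with the argument above.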
In the special case of a cross coupled oscillator, the oscillation frequency of the core MPRO is
References
[1] Anantha P. Chandrakasan, Samuel Sheng, and Robert Brodersen, “Low-Power CMOS
Digital Design,” IEEE Journal of Solid-State Circuits, vol. 27, no.4, pp. 473–484, Apr. 1992.
[2] Zhiyu Ru, Niels A. Moseley, Eric A. M. Klumperink, and Bram Nauta, “Digitally
Enhanced Software-Defined Radio Receiver Robust to Out-of-Band Interference,” IEEE Journal
of Solid-State Circuits, vol. 44, no. 12, pp. 3359–3375, Dec. 2009.
[3] David Murphy, Amr Hafez, Ahmad Mirzaei, Mohyee Mikhemar, Hooman Darabi, Mau-
Chung Frank Chang, Asad Abidi, “A Blocker-Tolerant Wideband Noise-Cancelling Receiver
with a 2dB Noise Figure,” International Solid-State Circuits Conference (ISSCC), pp. 74–75,
Feb. 2012.
[4] Xiang Gao, Eric A. M. Klumperink, and Bram Nauta, “Advantages of Shift Registers
Over DLLs for Flexible Low Jitter Multiphase Clock Generation,” IEEE Transactions on
Circuits And Systems – II: Express Briefs, vol. 55, no. 3, pp. 244–248, Mar. 2008.
[5] Jeong-Kyoum Kim, Jaeha Kim, Gyudong Kim, and Deog-Kyoon Jeong, “A Fully
Integrated 0.13-μm CMOS 40-Gb/s Serial Link Transceiver,” IEEE Journal of Solid-State
Circuits, vol. 44, no. 5, pp. 1510–1521, May 2009.
[6] Jaeha Kim, et al., “A 20-GHz Phase-Locked Loop for 40-Gb/s Serializing Transmitter in
0.13-μm CMOS,” IEEE Journal of Solid-State Circuits, vol. 41, no. 4, pp. 899–908, Apr. 2006.
[7] Shunichi Kaeriyama et al., “A 40 Gb/s Multi-Data-Rate CMOS Transmitter and Receiver
Chipset With SFI-5 Interface for Optical Transmission Systems,” IEEE Journal of Solid State
Circuits, vol. 44, no. 12, pp. 3568–3579, Dec. 2009.
[8] Kouichi Kanda, et al., “A Single-40 Gb/s Dual-20 Gb/s Serializer IC With SFI-5.2
Interface in 65 nm CMOS,” IEEE Journal of Solid State Circuits, vol. 44, no. 12, pp. 3580–3588,
Dec. 2009.
[9] Nikola Nedovic, et al., “A 3 Watt 39.8-44.6 Gb/s Dual Mode SFI5.2 SerDes Chip Set in
65 nm CMOS,” IEEE Journal of Solid State Circuits, vol. 45, no. 10, pp. 2016–2029, Oct. 2010.
[10] John G. Maneatis and Mark A. Horowitz, “Precise delay generation using coupled
oscillators,” IEEE Journal of Solid-State Circuits, vol. 28, no. 12, pp. 1273–1282, Dec. 1993.
[11] S. J. Lee, B. Kim, and K. Lee, “A novel high-speed ring oscillator for multiphase clock
generation using negative skewed delay scheme,” IEEE Journal of Solid-State Circuits, vol. 32,
no. 2, pp. 289–291, Feb. 1997.
[12] Lizhong Sun and Tadeusz A. Kwasniewski, “A 1.25-GHz 0.35-μm Monolithic CMOS
PLL Based on a Multiphase Ring Oscillator,” IEEE Journal of Solid-State Circuits, vol. 36, no.
6, pp. 910–916, Jun. 2001.
[14] Chan-Hong Park and Beomsup Kim, “A Low-Noise, 900-MHz VCO in 0.6-μm CMOS,”
IEEE Journal of Solid-State Circuits, vol. 34, no. 5, pp. 586–591, May 1999.
[15] T. Kan, G. Leung, and H. Luong, “A 2-V 1.8-GHz Fully Integrated CMOS Dual-Loop
Frequency Synthesizer,” IEEE Journal of Solid-State Circuits, vol. 37, no. 8, pp. 1012–1020,
Aug. 2002.
[16] Yalcin Alper Eken and John P. Uyemura, “A 5.9-GHz Voltage-Controlled Ring
Oscillator in 0.18-μm CMOS,” IEEE Journal of Solid-State Circuits, vol. 39, no. 1, pp. 230–233,
Jan. 2004.
[17] Matthew Z. Straayer and Michael H. Perrott, “A Multi-Path Gated Ring Oscillator TDC
with First-Order Noise Shaping,” IEEE Journal of Solid-State Circuits, vol. 44, no. 4, pp. 1089–1098,
Apr. 2009.
[18] Junfeng Xu, Shwetabh Verma, Thomas H. Lee, “Coupled Inverter Ring I/Q Oscillator for
Low Power Frequency Synthesis,” in Proc. IEEE Symp. VLSI Circuits, 2006, pp. 172–173.
[21] Aminghasem Safarian, Seema Anand, and Payam Heydari, “On the Dynamics of
Regenerative Frequency Dividers,” IEEE Transactions on Circuits and Systems—II: Express
Briefs, vol. 53, No. 12, pp. 1413–1417, Dec. 2006.
[22] Shwetabh Verma, Hamid R. Rategh, and Thomas H. Lee, “A Unified Model for
Injection-Locked Frequency Dividers,” IEEE Journal of Solid-State Circuits, vol. 38, no. 6, pp.
1015–1027, Jun. 2003.
[24] Stefano Dal Toso, Andrea Bevilacqua, Marc Tiebout, Nicola Da Dalt, Andrea Gerosa,
and Andrea Neviani, “An Integrated Divide-by-Two Direct Injection-Locking Frequency Divider
for Bands S Through Ku,” IEEE Transactions on Microwave Theory and Techniques, vol. 58,
no. 7, pp. 1686–1695, July 2010.
[26] Jun-Chau Chien, and Liang-Hung Lu, “Analysis and Design of Wideband Injection-
Locked Ring Oscillators With Multiple-Input Injection,” IEEE Journal of Solid-State Circuits,
vol. 42, no. 9, pp. 1906–1915, Sep. 2007.
[27] Ahmad Mirzaei, Mohammad E. Heidari, Rahim Bagheri, and Asad A. Abidi, “Multi-
Phase Injection Widens Lock Range of Ring-Oscillator-Based Frequency Dividers,” IEEE
Journal of Solid-State Circuits, vol. 43, no. 3, pp. 656–671, Mar. 2008.
[31] B. Razavi, “A study of phase noise in CMOS oscillators,” IEEE Journal of Solid-State
Circuits, vol. 31, no. 3, pp. 331–343, Mar. 1996.
[32] A. Hajimiri, S. Limotyrakis, and T. H. Lee, “Jitter and phase noise in ring oscillators,”
IEEE Journal of Solid-State Circuits, vol. 34, no. 6, pp. 790–804, Jun. 1999.
[33] Alper Demir, A. Mehrotra, and J. Roychowdhury, “Phase noise in oscillators: a unifying
theory and numerical methods for characterization,” IEEE Transactions on Circuits and Systems
– I: Fundamental Theory and Applications, vol. 47, no. 5, pp. 655–673, May 2000.
[34] J. A. McNeill, “Jitter in ring oscillators,” IEEE Journal of Solid-State Circuits, vol. 32,
no. 6, pp. 870–879, Jun. 1997.
[35] L. Dai, and R. Harjani, “Design of low-phase-noise CMOS ring oscillators,” IEEE
Transactions on Circuits and Systems – II: Analog and Digital Signal Processing, vol. 49, no. 5,
pp. 328–338, May 2002.
[37] Asad. A. Abidi, “Phase noise and jitter in CMOS ring oscillators,” IEEE Journal of Solid-
State Circuits, vol. 41, no. 8, pp. 1803–1816, Aug. 2006.
[38] Zhiming Deng, and Ali M. Niknejad, “The Speed–Power Trade-Off in the Design of
CMOS True-Single-Phase-Clock Dividers,” IEEE Journal of Solid-State Circuits, vol. 45, no.
11, pp. 2457–2465, Nov. 2010.
[40] Payam Heydari and Ravindran Mohanavelu, “A 40-GHz Flip-Flop-Based Frequency
Divider,” IEEE Transactions on Circuits and Systems–II: Express Briefs, vol. 53, no. 12, pp.
1358–1362, Dec. 2006.
[41] Christian Kromer, George von Büren, Gion Sialm, Thomas Morf, Frank Ellinger, and
Heinz Jäckel, “A 40-GHz Static Frequency Divider With Quadrature Outputs in 80-nm CMOS,”
IEEE Microwave and Wireless Components Letters, vol. 16, no. 10, pp. 564–566, Oct. 2006.
[42] Jaeha Kim, et al., “Design Optimization of On-Chip Inductive Peaking Structures for
0.13-μm CMOS 40-Gb/s Transmitter Circuits,” IEEE Transactions on Circuits and Systems – I:
Regular Papers, vol. 56, no. 12, pp. 2544–2555, Dec. 2009.
[43] Sudip Shekhar, Jeffrey S. Walling, and David Allstot, “Bandwidth Extension Techniques
for CMOS Amplifiers,” IEEE Journal of Solid-State Circuits, vol. 41, no. 11, pp. 2424–2439,
Nov. 2006.
[44] Jeffrey S. Walling, Sudip Shekhar, and David Allstot, “Wideband CMOS Amplifier
Design: Time-Domain Considerations,” IEEE Transactions on Circuits and Systems – I: Regular
Papers, vol. 55, no. 7, pp. 1781–1793, Aug. 2008.
[45] Hai Tao et al., “40–43-Gb/s OC-768 16:1 MUX/CMU Chipset With SFI-5 Compliance,”
IEEE Journal of Solid-State Circuits, vol. 38, no. 12, pp. 2169–2180, Dec. 2003.
[48] Ming-Shuan Chen, C.-K. Ken Yang, “A Low-Power Highly Multiplexed Parallel PRBS
Generator”, accepted for publication at IEEE Custom Integrated Circuits Conference (CICC),
Sept. 2012.
[49] Ahmad Mirzaei, Mohammad E. Heidari, Rahim Bagheri, Saeed Cherazi, and Asad Abidi,
“The Quadrature LC Oscillator: A Complete Portrait Based on Injection Locking,” IEEE Journal
of Solid-State Circuits, vol. 42, no. 9, pp. 1916–1932, Sep. 2007.
[50] Emad Hegazi, and Asad A. Abidi, “Varactor Characteristics, Oscillator Tuning Curves,
and AM-FM Conversion,” IEEE Journal of Solid State Circuits, vol. 38, no. 6, pp. 1033–1039,
Jun. 2003.
[51] Behzad Razavi, Design of Analog CMOS Integrated Circuits, McGraw Hill, 2001.
[52] Dong-Keon Lee, Jeong-Kwang Lee, and Hang-Geun Jeong, “A dual Compensated
Charge Pump With Reduced Current Mismatch,” Proceedings of the 4th WSEAS international
conference on Circuits, systems, signal and telecommunications, pp. 109–112, 2010.
[53] Woogeun Rhee, “Design of High Performance CMOS Charge Pumps in Phase-Locked
Loops,” IEEE International Symposium on Circuits and Systems, pp. 545–548, Jul. 1999.
[54] Patrick Chiang, William J. Dally, Ming-Ju Edward Lee, Ramesh Senthinathan, Yangjin
Oh, and Mark A. Horowitz, “A 20-Gb/s 0.13-μm CMOS Serial Link Transmitter Using an LC-
PLL to Directly Drive the Output Multiplexer,” IEEE Journal of Solid-State Circuits, vol. 40, no.
4, pp. 1004–1011, Apr. 2005.
[55] C.-K. Ken Yang and Mark A. Horowitz, “A 0.8-μm CMOS 2.5 Gb/s Oversampling
Receiver and Transmitter for Serial Links,” IEEE Journal of Solid-State Circuits, vol. 31, no. 12,
pp. 2015–2023, Dec. 1996.
[56] Sudip Shekhar, et al., “Strong Injection Locking in Low-Q LC Oscillators: Modeling and
Application in a Forwarded-Clock I/O Receiver,” IEEE Transactions on Circuits and Systems-I:
Regular Papers, vol. 56, no. 8, pp. 1818–1829, Aug. 2009.
[57] Amr A. Hafez, C.-K. Ken Yang, “Design and Optimization of Multipath Ring
Oscillators,” IEEE Transactions on Circuits and Systems—I: Regular Papers, vol. 58, no. 10,
pp. 2332–2345, Oct. 2011.
[58] Amr A. Hafez, C.-K. Ken Yang, “Analysis and Design of Superharmonic Injection-
Locked Multipath Ring Oscillators,” Submitted to IEEE Transactions on Circuits and Systems—
I: Regular Papers.
[59] Amr A. Hafez, Ming-Shuan Chen, and C.-K. Ken Yang, “A Multi-Phase Multi-frequency
Clock Generator Using Superharmonic Injection Locked Multipath Ring Oscillators as
Frequency Dividers,” Accepted in IEEE Asian Solid State Circuits Conference (ASSCC), Nov.
2012.
[60] Ming-Shuan Chen, Amr A. Hafez, and C.-K. Ken Yang, “A 0.1-1.5 GHz 8-Bit Inverter-
Based Digital-to-Phase Converter Using Harmonic Rejection,” Accepted in IEEE Asian Solid
State Circuits Conference (ASSCC), Nov. 2012.
[61] Jaeha Kim, Brian S. Leibowitz, Metha Jeeradit, “Impulse Sensitivity Function Analysis
of Periodic Circuits,” IEEE International Conference on Computer-Aided Design (ICCAD), pp.
386–391, Nov 2008.