Efficient Computation of Quasi-Periodic Circuit Operating Conditions Via A Mixed Frequency-Time Approach
k1 ;1 kS =;1 kc =;1
\[
v(t) = \sum_{k_1=-K_1}^{K_1} \cdots \sum_{k_S=-K_S}^{K_S} \sum_{k_c=-\infty}^{\infty} V(k_1,\ldots,k_S,k_c)\, e^{-j2\pi k_1 f_1 t} \cdots e^{-j2\pi k_S f_S t}\, e^{-j2\pi k_c f_c t}, \qquad (3)
\]
where $V(k_1,\ldots,k_S,k_c) \in \mathbb{C}^N$. An interesting property of the MFT method is that it is not necessary to truncate to a finite number of harmonics of $f_c$.

Figure 1: Sample envelope is the waveform traced out when the signal is sampled with the clock period. (Legend: Envelope, Sampled Envelope.)
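The idea in the Figure 1 caption can be illustrated numerically. The following is a minimal sketch, not taken from the paper: the test signal, the frequencies `f1` and `fc`, and the offset `t0` are all hypothetical choices. It samples a slow tone multiplied by a clock-rate factor once per clock period and compares the result with a waveform that contains the slow tone only.

```python
import math

def signal(t, f1=100.0, fc=100e3):
    """Hypothetical test signal: a slow tone multiplied by a clock-rate factor."""
    return math.cos(2 * math.pi * f1 * t) * (1.0 + 0.5 * math.cos(2 * math.pi * fc * t))

def sampled_envelope(t0, count, f1=100.0, fc=100e3):
    """Sample the signal once per clock period, starting at offset t0."""
    Tc = 1.0 / fc
    return [signal(t0 + n * Tc, f1, fc) for n in range(count)]

def slow_tone_only(t0, count, f1=100.0, fc=100e3):
    """Closed form of the samples: the clock factor is frozen at its value at t0."""
    Tc = 1.0 / fc
    gain = 1.0 + 0.5 * math.cos(2 * math.pi * fc * t0)
    return [gain * math.cos(2 * math.pi * f1 * (t0 + n * Tc)) for n in range(count)]

t0 = 2.0e-6
env = sampled_envelope(t0, 2000)
ref = slow_tone_only(t0, 2000)
max_err = max(abs(a - b) for a, b in zip(env, ref))
print("largest deviation from the slow-tone waveform:", max_err)
```

Under these assumptions the sampled sequence varies only at the slow frequency, which is the sense in which the clock fundamental drops out of the sample envelope.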
Now suppose that $v(t)$ is sampled at a discrete set of points $t_n^0 = t_0 + nT_c$, where $T_c = 1/f_c$ is the clock period, $t_0 \in [0, T_c)$ and $n$ runs over the integers, to obtain a discrete signal $\bar v(t_n^0)$. Since
\[
\bar v(t_n^0) = \sum_{k_1=-K_1}^{K_1} \cdots \sum_{k_S=-K_S}^{K_S} \bar V(k_1,\ldots,k_S)\, e^{-j2\pi k_1 f_1 t_n^0} \cdots e^{-j2\pi k_S f_S t_n^0} \qquad (4)
\]
where
\[
\bar V(k_1,\ldots,k_S) = \sum_{k_c=-\infty}^{\infty} V(k_1,\ldots,k_S,k_c)\, e^{-j2\pi k_c f_c t_0}, \qquad (5)
\]
the "envelope" $\bar v(t_n^0)$ is $S$-quasiperiodic and can be represented as a Fourier series in only the "information" fundamentals. The clock fundamental has disappeared. For an example of such an envelope, see Figure 1. The continuous waveform is the waveform that has $\bar V$ as its Fourier coefficients, or, equivalently, obtained by Fourier interpolation of the sampled points.

In principle, since there are only $K = \prod_{s=1}^{S} (2K_s + 1)$ Fourier coefficients to represent $\bar v$, then once the value of $K$ distinct points $t_1, \ldots, t_K$ along the sample envelope are known, then the full envelope can be recovered. The envelope corresponding to the quasiperiodic operating point is obtained by obtaining $K$ sample points that lie on the solution to the DAE given by (1).

We define the state transition function $\Phi(v_0, t_k, t_f) = v(t_f)$, where $v(t)$ satisfies equation (1) for $t \in [t_k, t_f]$ and $v(t_k) = v_0$. In particular, define the vector
\[
\bar v_0 = [\bar v^T(t_1) \; \cdots \; \bar v^T(t_K)]^T = [v^T(t_1) \; \cdots \; v^T(t_K)]^T \qquad (6)
\]
where superscript $T$ denotes matrix transpose, to contain $\bar v$ at the $K$ sample points $t_k = t_0 + n_k T_c$, $k = 1, \ldots, K$, $n_k \in \mathbb{Z}$. The value of the $K$ points that follow by one cycle can be obtained from the transition function,
\[
\bar v_{T_c} = [v^T(t_1 + T_c) \; \cdots \; v^T(t_K + T_c)]^T
= [\Phi(v(t_1), t_1, t_1 + T_c)^T \; \cdots \; \Phi(v(t_K), t_K, t_K + T_c)^T]^T \qquad (7)
\]
which may be written more compactly by introducing the multi-cycle transition function that is the collection of the $K$ transition functions from $t_k$ to $t_k + T_c$, as
\[
\bar v_{T_c} = \Phi_{T_c}(\bar v_0). \qquad (8)
\]
Now note that for each node $n$, the vector of signals on that node, at the sample time plus one clock cycle, $\bar v^n_{T_c}$, is a delayed version of the signals at the sample points (this will be made more clear below). Therefore there exists a linear operator $D_{T_c}$ that maps $\bar v^n_0$ to $\bar v^n_{T_c}$:
\[
\bar v^n_{T_c} = D_{T_c} \bar v^n_0. \qquad (9)
\]
Note that $D_{T_c}$ is a real matrix and independent of node $n$. Hence (9) holds for each $n = 1, \ldots, N$. It represents a boundary condition on the solution to (1).

Combining (9) and (8) gives
\[
(D_{T_c} \otimes I_N)\,\bar v_0 - \Phi_{T_c}(\bar v_0) = 0 \qquad (10)
\]
where $\otimes$ is the Kronecker product (see [5] for definition) and $I_N$ is the $N$ by $N$ identity matrix. Equation 10 is a system of $KN$ nonlinear equations in $KN$ unknowns $\bar v_0$ that can be solved for the envelope sample points. From these sample points and the transition functions the circuit's quasiperiodic operating point (in particular, the spectrum of $v$) can be recovered, which is discussed in Section 7.

3 Sample point selection

To construct the matrix $D_{T_c}$, referred to as the delay matrix, consider the Fourier series of $\bar v_0$ and $\bar v_{T_c}$. Referring to equation (4), we have
\[
\bar v(t_n^0 + T_c) = \sum_{k_1=-K_1}^{K_1} \cdots \sum_{k_S=-K_S}^{K_S} \bar V(k_1,\ldots,k_S)\, e^{-j2\pi k_1 f_1 t_n^0} \cdots e^{-j2\pi k_S f_S t_n^0}\, \Omega_{T_c}(k_1,\ldots,k_S) \qquad (11)
\]
where
\[
\Omega_{T_c}(k_1,\ldots,k_S) = e^{-j2\pi k_1 f_1 T_c} \cdots e^{-j2\pi k_S f_S T_c}. \qquad (12)
\]
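The shift property stated in (11) and (12) can be checked with a few lines of code. This is a minimal sketch for a single information fundamental ($S = 1$); the frequencies, the evaluation time `t`, and the coefficient values in `Vbar` are hypothetical, and the exponent sign follows the Fourier series convention of (4).

```python
import cmath
import math

f1, fc = 100.0, 100e3            # hypothetical information and clock fundamentals
Tc = 1.0 / fc

# Hypothetical envelope coefficients Vbar(k1), conjugate-symmetric so the envelope is real.
Vbar = {0: 0.3, 1: 0.5 - 0.2j, -1: 0.5 + 0.2j, 2: 0.1j, -2: -0.1j}

def envelope(t, coeffs):
    """Evaluate sum_k coeffs[k] * exp(-j 2 pi k f1 t), the form of (4) with S = 1."""
    return sum(c * cmath.exp(-2j * math.pi * k * f1 * t) for k, c in coeffs.items())

def omega(k):
    """Delay factor of (12) for a single fundamental."""
    return cmath.exp(-2j * math.pi * k * f1 * Tc)

# Scale every coefficient by its delay factor, as on the right-hand side of (11).
shifted = {k: c * omega(k) for k, c in Vbar.items()}

t = 1.234e-3
lhs = envelope(t + Tc, Vbar)     # envelope evaluated one clock period later
rhs = envelope(t, shifted)       # original time, coefficients scaled by Omega
print(abs(lhs - rhs))
```

Delaying the envelope by one clock period is therefore a diagonal scaling in the Fourier coefficient domain, which is what makes the delay operator cheap to apply.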
Thus if $\Gamma$ is the matrix mapping sample points on the envelope to Fourier coefficients, then the delay matrix may be constructed as
\[
D_{T_c} = \Gamma^{-1} \Omega_{T_c} \Gamma. \qquad (13)
\]
In particular $\Gamma$ may be constructed as the Kronecker product of one-dimensional $(2K_s + 1)$-point Fourier-transform matrices
\[
\Gamma^{(s)}_{mn} = e^{j2\pi m f_s t_n} / (2K_s + 1). \qquad (14)
\]

as
\[
\left( (D_{T_c} \otimes I_N) - J \right) \Delta \bar v_0^{\,i} = b \qquad (21)
\]
using the iterative solver GMRES [11], and setting
\[
\bar v_0^{\,i+1} = \bar v_0^{\,i} + \Delta \bar v_0^{\,i}. \qquad (22)
\]
Each iteration of GMRES requires a matrix-vector multiplication. For a vector $q \in \mathbb{R}^{KN}$, the term $(D_{T_c} \otimes I_N) q$ is calculated by first applying a $K$ dimensional DFT $N$ times, then scaling each row with $\Omega_{T_c}$, and finally applying a $K$ dimensional inverse DFT $N$ times.

Let $q$ be partitioned into $q = [q_1^T \; \cdots \; q_K^T]^T$, $q_k \in \mathbb{R}^N$, for $1 \le k \le K$. Then
\[
\frac{\partial \Phi}{\partial \bar v_0}\, q =
\begin{bmatrix}
\dfrac{\partial \Phi_1}{\partial \bar v_0(t_1)}\, q_1 \\
\vdots \\
\dfrac{\partial \Phi_K}{\partial \bar v_0(t_K)}\, q_K
\end{bmatrix}. \qquad (23)
\]

Footnote 1: This follows because the eigenvalues of $H$ are typically inside the unit circle of the complex plane. The delay matrix replicates the eigenvalue structure $K$ times, each shift being a complex number of order unity, and generally causing the convex hull of the eigenvalues of $D_{T_c} - J$ to enclose the origin.
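The construction of the delay matrix in (13), and the transform, scale, inverse-transform sequence used in the matrix-vector product, can be sketched for one information fundamental. Everything concrete here is an assumption made for illustration: the frequencies, the clock-cycle indices `n_k`, the test envelope `v`, and the use of a small dense inverse in place of a fast transform.

```python
import cmath
import math

f1, fc = 100.0, 100e3                 # hypothetical fundamentals (S = 1)
Tc = 1.0 / fc
harmonics = [-1, 0, 1]                # K1 = 1, so K = 3
n_k = [0, 333, 667]                   # hypothetical clock-cycle indices of the samples
t_k = [n * Tc for n in n_k]
K = len(harmonics)

def matmul(A, B):
    return [[sum(A[i][m] * B[m][j] for m in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def inverse(A):
    """Gauss-Jordan inverse with partial pivoting (adequate for tiny matrices)."""
    n = len(A)
    M = [list(row) + [1.0 if i == j else 0.0 for j in range(n)] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [x / piv for x in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [row[n:] for row in M]

# Gamma^{-1} maps Fourier coefficients to samples, with the sign convention of (4).
Gamma_inv = [[cmath.exp(-2j * math.pi * k * f1 * t) for k in harmonics] for t in t_k]
Gamma = inverse(Gamma_inv)            # samples to Fourier coefficients
Omega = [[cmath.exp(-2j * math.pi * harmonics[i] * f1 * Tc) if i == j else 0.0
          for j in range(K)] for i in range(K)]
D = matmul(matmul(Gamma_inv, Omega), Gamma)   # equation (13)

def v(t):
    """Hypothetical envelope containing only the harmonics kept above."""
    return 0.2 + math.cos(2 * math.pi * f1 * t) + 0.3 * math.sin(2 * math.pi * f1 * t)

v0 = [v(t) for t in t_k]              # samples on the envelope
vT = [v(t + Tc) for t in t_k]         # the same samples one clock period later
Dv0 = [sum(D[i][j] * v0[j] for j in range(K)) for i in range(K)]
max_imag = max(abs(D[i][j].imag) for i in range(K) for j in range(K))
max_err = max(abs(a - b) for a, b in zip(Dv0, vT))
print(max_imag, max_err)
```

In this small case the computed matrix is real to rounding error and reproduces the delayed samples, in line with the remarks following (9). A production code would apply the three factors in sequence to each node rather than form the dense matrix.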
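Proof. For linear time-invariant circuits, the diagonal blocks of $\partial \Phi / \partial \bar v_0$ are $\partial \Phi_1 / \partial \bar v(t_1) = \cdots = \partial \Phi_K / \partial \bar v(t_K) = H$. The Jacobian
\[
\begin{aligned}
& (\Gamma^{-1} \Omega_{T_c} \Gamma) \otimes I_N - (I_K \otimes H) && (24) \\
&= (\Gamma^{-1} \otimes I_N)(\Omega_{T_c} \otimes I_N)(\Gamma \otimes I_N) - (I_K \otimes H) && (25) \\
&= (\Gamma^{-1} \otimes I_N)\{(\Omega_{T_c} \otimes I_N) - (\Gamma^{-1} \otimes I_N)^{-1} (I_K \otimes H)(\Gamma \otimes I_N)^{-1}\}(\Gamma \otimes I_N) && (26) \\
&= (\Gamma^{-1} \otimes I_N)\{(\Omega_{T_c} \otimes I_N) - (\Gamma \otimes I_N)(I_K \otimes H)(\Gamma^{-1} \otimes I_N)\}(\Gamma \otimes I_N) && (27) \\
&= (\Gamma^{-1} \otimes I_N)\{(\Omega_{T_c} \otimes I_N) - (\Gamma I_K \Gamma^{-1}) \otimes (I_N H I_N)\}(\Gamma \otimes I_N) && (28) \\
&= (\Gamma^{-1} \otimes I_N)\{(\Omega_{T_c} \otimes I_N) - (I_K \otimes H)\}(\Gamma \otimes I_N). && (29)
\end{aligned}
\]
The step from (24) to (25) follows from $I_N = I_N I_N I_N$ and Lemma 5.1. The step from (26) to (27) is due to Lemma 5.2 (b), and the step from (27) to (28) is due to Lemma 5.2 (a). Since $(\Gamma^{-1} \otimes I_N)$ is unitary and its inverse is $(\Gamma \otimes I_N)$, the right hand side of equation (29) has the same spectrum as $(\Omega_{T_c} \otimes I_N) - (I_K \otimes H)$. It is easy to verify that [...] $k = 1, \ldots, K$.

Figure 2: Eigenvalue distribution before and after preconditioning

The preceding analysis suggests a good way of preconditioning [...] $i = 1, \ldots, K$, of $\partial \Phi / \partial \bar v_0$. In particular, if the single-cycle state- [...]

[Figure: residual norm ratio $\|r\| / \|r\|$ on a logarithmic vertical axis, horizontal axis from 0 to 550. Legend: Preconditioned, Raw.]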
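The spectrum argument in the proof can be checked on a small case. The sketch below is an illustration under stated assumptions, not the paper's implementation: it uses one fundamental with $K = 3$, $N = 2$ nodes, a hypothetical triangular $H$ with eigenvalues 0.5 and -0.3, and hypothetical sample times. It forms expression (24) densely and tests whether each difference $\omega_k - \lambda_j$, with $\omega_k$ a diagonal entry of $\Omega_{T_c}$ and $\lambda_j$ an eigenvalue of $H$, is an eigenvalue of that matrix.

```python
import cmath
import math

def matmul(A, B):
    return [[sum(A[i][m] * B[m][j] for m in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def inverse(A):
    """Gauss-Jordan inverse with partial pivoting (adequate for tiny matrices)."""
    n = len(A)
    M = [list(row) + [1.0 if i == j else 0.0 for j in range(n)] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [x / piv for x in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [row[n:] for row in M]

def kron(A, B):
    """Kronecker product of two dense matrices stored as lists of rows."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def det(A):
    """Determinant by Gaussian elimination with partial pivoting."""
    n, M, d = len(A), [list(r) for r in A], 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        if M[p][c] == 0:
            return 0.0
        if p != c:
            M[c], M[p] = M[p], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return d

def eye(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

# Hypothetical small case: K = 3 harmonics of one fundamental, N = 2 nodes.
f1, Tc = 1.0, 0.2
harmonics, t_k = [-1, 0, 1], [0.0, 0.2, 0.4]
H = [[0.5, 0.2], [0.0, -0.3]]         # triangular, so its eigenvalues are 0.5 and -0.3
omegas = [cmath.exp(-2j * math.pi * k * f1 * Tc) for k in harmonics]
Gamma_inv = [[cmath.exp(-2j * math.pi * k * f1 * t) for k in harmonics] for t in t_k]
Omega = [[omegas[i] if i == j else 0.0 for j in range(3)] for i in range(3)]
D = matmul(matmul(Gamma_inv, Omega), inverse(Gamma_inv))

A, B = kron(D, eye(2)), kron(eye(3), H)
J = [[A[i][j] - B[i][j] for j in range(6)] for i in range(6)]    # expression (24)

def char_det(mu):
    """det(J - mu I); it vanishes when mu is an eigenvalue of J."""
    return det([[J[i][j] - (mu if i == j else 0.0) for j in range(6)] for i in range(6)])

predicted = [w - lam for w in omegas for lam in (0.5, -0.3)]
residuals = [abs(char_det(mu)) for mu in predicted]
print(max(residuals))
```

Each predicted value drives the characteristic determinant to rounding level, which is consistent with the similarity relation between (24) and (29).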
amplitudes are transformed into time domain initial conditions via inverse Fourier transform. At higher input power levels, using a Newton continuation [1] method, with the amplitude of the non-clock signals as the continuation parameter, is generally effective in securing convergence.

[...] in $t \in [t_i, t_i + T_c]$, $i = 1, \ldots, K$, are available. From these pieces of information, the spectrum of $v(t)$ can be obtained. Let
\[
v(t) = \sum_{k_c=-K_c}^{K_c} \sum_{k_1=-K_1}^{K_1} \cdots \sum_{k_S=-K_S}^{K_S} V(k_1,\ldots,k_S,k_c)\, e^{-j2\pi k_1 f_1 t} \cdots e^{-j2\pi k_S f_S t}\, e^{-j2\pi k_c f_c t}. \qquad (31)
\]
Define $\bar v(\tau) = [v(t_1 + \tau)^T \; v(t_2 + \tau)^T \; \cdots \; v(t_K + \tau)^T]^T$. Then
\[
\bar v(\tau) = \{(\Gamma^{-1} \Omega(\tau)) \otimes I_N\}
\begin{bmatrix}
\vdots \\
\sum_{k_c=-K_c}^{K_c} V(k_1,\ldots,k_S,k_c)\, e^{-j2\pi k_c f_c \tau} \\
\vdots
\end{bmatrix}. \qquad (32)
\]
Then for each $KN$-vector $V(\cdot, k_c)$, where $-K_c \le k_c \le K_c$, which is the collection of all $N$-vectors $V(k_1,\ldots,k_S,k_c)$, where $-K_1 \le k_1 \le K_1, \ldots, -K_S \le k_S \le K_S$ (the actual order is determined by the Fourier transform),
\[
V(\cdot, k_c) = \frac{1}{T_c} \int_0^{T_c} \{(\Omega(\tau)^{-1} \Gamma) \otimes I_N\}\, \bar v(\tau)\, e^{j2\pi k_c f_c \tau}\, d\tau. \qquad (33)
\]
Forming $\{(\Omega(\tau)^{-1} \Gamma) \otimes I_N\}\, \bar v(\tau)$ requires the values for $v(t_1 + \tau), \ldots, v(t_K + \tau)$, or synchronized time steps between cycles. The total cost is one $KN$-vector integration and $M$ Fourier transforms, where $M$ is the number of synchronized time points.

The synchronized time step requirement may not be easily met in practice. One alternative is to use interpolation schemes. However, they potentially lose accuracy. Another alternative is to trade integrations for Fourier transforms. Specifically, it is easy to verify that
\[
\begin{aligned}
V(k_1,\ldots,k_S,k_c)
&= \frac{1}{T_c} \int_0^{T_c} E_p^T \{(\Omega(\tau)^{-1} \Gamma) \otimes I_N\}\, \bar v(\tau)\, e^{j2\pi k_c f_c \tau}\, d\tau \\
&= E_p^T (\Gamma \otimes I_N) \left( \frac{1}{T_c} \int_0^{T_c} \bar v(\tau)\, e^{j2\pi \left( \sum_{i=1}^{S} k_i f_i \right) \tau}\, e^{j2\pi k_c f_c \tau}\, d\tau \right) \qquad (34)
\end{aligned}
\]
where $E_p$ is a $KN \times N$ block matrix whose $p$th $N \times N$ block is $I_N$ and other blocks zero, and $p$ is determined by $(k_1,\ldots,k_S)$ from the Fourier transform. Calculating (34) does not require synchronized time points. The total cost of calculating $V(\cdot, k_c)$ is $K$ $KN$-vector integrations plus one final Fourier transform. However, it might be more expensive since integrations normally cost more than Fourier transforms.

8 Test results

8.1 Switched-capacitor Filter

The first example is a low-pass switched-capacitor filter of 4kHz bandwidth and having 238 nodes, resulting in 337 equations. To analyze this circuit, the MFT analysis was performed with an 8-phase 100kHz clock and a 1V sinusoidal input at 100Hz.

The 1000 to 1 clock to signal ratio makes this circuit difficult for traditional circuit simulators to analyze. In the MFT method, three harmonics were used to model the input signal. The eight-phase clock resulted in the need to use about 1250 timepoints in each transient integration. This brings the total number of variables solved by the analysis to slightly less than three million ($337 \times (2 \times 3 + 1) \times 1250 = 2{,}948{,}750$). The simulation took a little less than 20 minutes of CPU time to finish on a Sun UltraSparc1 workstation with 128 megabytes of memory and a 167MHz CPU clock. Figure 4 shows the output spectrum of the filter.

Figure 4: Harmonic distortion of a switched-capacitor filter (vertical axis in dB, horizontal axis in Hz)

8.2 High-performance receiver

The second example is the high-performance image rejection receiver also discussed in [14]. It consists of a low-noise amplifier, a splitting network, two double-balanced mixers, and two broad-band Hilbert transform output filters combined with a summing network that is used to suppress the undesired side-band. A limiter in the LO path is used for controlling the amplitude of the LO. It is a rather large RF circuit that contains 167 bipolar transistors and uses 378 nodes. This circuit generated 987 equations in the simulator.

To determine the intermodulation distortion characteristics, the circuit was driven by a 780MHz LO and two 50mV closely placed RF inputs, at 840MHz and 840MHz+10kHz, respectively. Three harmonics were used to model each of the RF signals. 200 time points were used in each transient clock-cycle integration, considered to be conservative in terms of accuracy for this circuit. As a consequence, nearly ten million unknowns ($987 \times (2 \times 3 + 1)^2 \times 200 = 9{,}672{,}600$) were generated. It took 55 CPU minutes to finish on a Sun UltraSparc10 workstation with 128 megabytes of physical memory and a 300MHz CPU clock. Figure 5 shows 3rd and 5th order distortion products.

To understand the efficiency of the MFT method, consider that traditional transient analysis would need at least 80,000 cycles of the LO to compute the distortion, a simulation time of over two days. Additionally, the results would be very inaccurate, because
of the large amount of numerical error accumulated by integrating over so many cycles. In contrast, the MFT method is able to resolve very small signal levels, such as the 5th order distortion products shown in Figure 5.

Figure 5: Intermodulation distortion of a high-performance receiver (vertical axis in dB, horizontal axis in MHz from 59.97 to 60.04)

Solving the MFT equations by direct factorization methods is also impractical, as the storage needed for the factored rank-50,000 ($987 \times (2 \times 3 + 1)^2 = 48{,}363$) MFT Jacobian of Equation 19 is several gigabytes. Forming the Jacobian matrix by direct methods would also require computation time proportional to the cost of 50,000 transient integrations, again a number on the order of days.

9 Conclusion

In this paper we have demonstrated that the MFT method is an efficient approach to analyzing multi-frequency nonlinear effects such as intermodulation distortion. Making the MFT method computationally efficient on problems of engineering interest required careful construction of the delay matrix, matrix-implicit Krylov subspace iterative linear solvers, and a preconditioner tailored to the MFT method and the circuits it typically analyzes. As a result, nonlinear systems comprising tens of millions of unknowns can be solved in less than an hour with computational resources commonly available to engineering designers.

One salient advantage of the MFT method as described here is that the dominant part of the computation is in computing the functions $\Phi$ and the product of the Jacobian of $\Phi$ with some vector. Both computations are essentially the solution of an initial value problem. Each application of the operator $D_{T_c} - J$, or calculation of the Newton residual, involves solving $K$ such initial value problems, that is, integrating $K$ sets of DAEs forward in time over one clock period. Each of the $K$ problems, however, is essentially decoupled. Parallel implementations of the MFT will therefore enjoy very efficient processor utilization. This decoupling also assists the implementation of out-of-core solvers. In fact we have observed it is possible to implement the MFT algorithm as an out-of-core algorithm with over 80% average CPU utilization.

Several extensions of this work are possible. Of particular interest is the computation of poly-cyclostationary [10] noise statistics, which can be performed by linearizing around the quasiperiodic steady-state.

References

[1] A. ALLGOWER AND K. GEORG, Numerical Continuation Methods, Springer-Verlag, New York, 1990.

[2] L. O. CHUA AND A. USHIDA, Algorithms for computing almost periodic steady-state response of nonlinear systems to multiple input frequencies, IEEE Trans. Circuits and Systems, 28 (1981), pp. 953–971.

[4] K. S. KUNDERT, J. K. WHITE, AND A. SANGIOVANNI-VINCENTELLI, Steady-State Methods for Simulating Analog and Microwave Circuits, Kluwer Academic Publishers, Boston, 1990.

[5] P. LANCASTER AND M. TISMENETSKY, The Theory of Matrices, Academic Press, second ed., 1985.

[6] D. LONG, R. MELVILLE, K. ASHBY, AND B. HORTON, Full chip harmonic balance, in Proceedings of the Custom Integrated Circuits Conference, May 1997.

[7] R. MELVILLE, P. FELDMANN, AND J. ROYCHOWDHURY, Efficient multi-tone distortion analysis of analog integrated circuits, in Proceedings of the Custom Integrated Circuits Conference, May 1995.

[8] M. OKUMURA, T. SUGAWARA, AND H. TANIMOTO, An efficient small signal frequency analysis method for nonlinear circuits with two frequency excitations, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 9 (1990), pp. 225–235.

[9] J. ROYCHOWDHURY, Efficient methods for simulating highly nonlinear multirate circuits, in Proceedings of the 34th Design Automation Conference, Anaheim, CA, June 1997, pp. 269–274.

[10] J. ROYCHOWDHURY, D. LONG, AND P. FELDMANN, Cyclostationary noise analysis of large RF circuits with multitone excitations, IEEE J. Sol. St. Circuits, 33 (1998), pp. 324–336.

[11] Y. SAAD AND M. H. SCHULTZ, GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems, SIAM J. Sci. Stat. Comput., 7 (1986), pp. 856–869.

[12] R. TELICHEVESKY, K. S. KUNDERT, AND J. K. WHITE, Efficient steady-state analysis based on matrix-free Krylov-subspace methods, in Proceedings of the 1995 Design Automation Conference, June 1995.

[13] R. TELICHEVESKY, K. S. KUNDERT, AND J. K. WHITE, Efficient AC and noise analysis of two-tone RF circuits, in Proceedings of the 1996 Design Automation Conference, June 1996.

[14] R. TELICHEVESKY, J. WHITE, AND K. KUNDERT, Receiver characterization using periodic small-signal analysis, in Proceedings of the Custom Integrated Circuits Conference, May 1996.

[15] Y. THODESEN, Two stage method for efficient simulation of parametric circuits, PhD thesis, Department of Telecommunications, the Norwegian Institute of Technology, 1996.