
IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 31, NO. 11, NOVEMBER 2023

Design and Analysis of an Ultralow-Voltage Complementary Fold-Interleaved Multiple-Tail Current Mode Logic
R. Malmir and M. B. Ghaznavi-Ghoushchi

Abstract— In this article, we propose a new design called complementary fold-interleaved multiple-tail current mode logic (CFIMTCML) to implement logical functions with a fan-in higher than 2. This idea is implemented by alternately executing two steps. In the first step, a tail current is divided into multiple currents, but with a shallower depth from the ground to the common-mode point. It is applied on all fully stacked stages. The second step is implemented by alternating nMOS and pMOS differential pairs and utilizing current mirrors in adjacent logic levels. The proposed approach allows a minimum power supply equal to that of the conventional MCML inverter. Analytical details and design procedures are presented. The method has been validated with post-layout simulations considering 180 nm CMOS technology and a supply voltage as low as 0.6 V. In particular, the SOP_X4 is implemented with the conventional SCL, MTCML, multifolded MOS current mode logic (MFMCML), and CFIMTCML. Results show that the proposed logic demonstrates 90%, 20%, and 50% power delay product (PDP) reduction compared with conventional SCL, MTCML, and MFMCML, respectively. Also, the results of implementation and comparison of other gates, such as the carry generator and the 8-bit carry generator, demonstrate at least about 20% reduction of PDP. The delay increase rate at lower voltages for the proposed gates is slower than that of the counterparts (15 ps at 1.8 V to 124 ps at 0.6 V for the proposed and 27 ps at 1.8 V to 253 ps at 0.6 V for MFMCML). This mitigated degradation is a benefit of the proposed logic for low-noise/low-power applications demanding ultralow voltages.

Index Terms— Current mode logic (CML), multiple-folded MOS CML (MFMCML), multiple-tailed CML (MTCML), source coupled logic (SCL).

NOMENCLATURE
CML        Current mode logic.
CFIMTCML   Complementary fold-interleaved multiple-tail current mode logic.
SCL        Source coupled logic.
SOP_X4     Sum of product as: OUT = (A × B) + (C × D).
PDP        Power delay product.
MTCML      Multiple-tailed CML.
MFMCML     Multifolded MOS current mode logic.
DSCL       Differential static CMOS logic.
DCVSL      Differential cascade voltage switch logic.
SDDSCL     Shallow-depth differential SCL.
P-D        Power and delay.
VOV        Overdrive voltage.
τCA        DelayCA.
PT         Total power.

Manuscript received 2 April 2023; revised 17 July 2023; accepted 10 August 2023. Date of publication 29 August 2023; date of current version 24 October 2023. (Corresponding author: M. B. Ghaznavi-Ghoushchi.)
The authors are with the Department of Electrical Engineering, Shahed University, Tehran 33191-18651, Iran (e-mail: [email protected]; [email protected]).
Color versions of one or more figures in this article are available at https://doi.org/10.1109/TVLSI.2023.3305915.
Digital Object Identifier 10.1109/TVLSI.2023.3305915

I. INTRODUCTION

CMLs are attractive for wireless and wireline applications at promising low voltages. In recent years, due to the expansion of applications relying on digital signal processing, there has been a need for high-speed and high-resolution ICs. For instance, video and audio signal processing, digital-to-analog converters, and analog-to-digital converters are target applications.

In these ICs, analog and digital circuits are embedded together. The resolution of analog circuits is affected by the switching noise of digital circuits. The switching noise of the digital block is transferred to the analog block through the substrate and power lines. In particular, due to its high noise levels, traditional CMOS logic is inappropriate for high-resolution ICs at low voltages. Consequently, alternative logic styles with reduced switching noise are needed.

One of the successful logic styles is SCL. SCL is an analog-like digital logic that brings the near-constant power of analog operation, as in (1), into digital functionality. The SCL is recognized by its constant power consumption over a reasonable range of frequencies, high speed, low switching noise, and lower sensitivity to process changes [1], [2], [3]. In mixed-signal circuits, low switching noise significantly reduces the digital noise induced in analog circuits [4], [5]. Because the switching noise in the SCL is extremely low, SCL circuits are popular and utilized in precision mixed-signal applications and high-speed systems [6]. With the recent advancements in technology and increasing applications of mobile electronic devices, it is necessary to utilize circuits and systems with more fan-in. On the other hand, in portable devices, it has been proven that reducing the supply voltage leads to a considerable reduction in power consumption. A decrease in the headroom voltage reduces power dissipation and output logic
1063-8210 © 2023 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.


swing [7]. Due to the correlation among power supply voltage, power consumption, and system speed, electronic systems are dependent on the power supply voltage, which leads to a demand for low-voltage, high-performance design approaches. A comparison of the four structures CMOS, SCL, MTCML, and MFMCML for different frequencies is shown in Fig. 1(a). It has been shown that in approaches based on the SCL, the power consumption is constant over frequency

$$ \text{POWER} = V_{DD} \cdot I_{SS}. \tag{1} $$

Fig. 1. (a) Power profile CMOS, CML, MTCML, and CFIMTCML near the crossover frequency. (b) Symbols of current mirror [10]. (c) Other current symbol.

The general structure of the conventional SCL logic is shown in Fig. 2(a). The SCL approach was successfully applied to the MUX circuit with three fan-in [8]. In the conventional SCL structure, for each extra input, one level is added to the stacked levels. With technology scaling and limited headroom voltage, this prohibits implementing gates with more fan-in in SCL logic. With the reduction of the operating supply voltage, this limitation has become more severe in the conventional SCL. For N inputs in the SCL logic, there are N + 2 stack levels. Complex gates and sub-systems inherently prevent low-VDD design [9]. In recent years, many efforts have been made to reduce the number of stack levels in the SCL method. In the following sections, the methods that can reduce the number of stages and the headroom voltage will be discussed. In the digital scope, there is a triple-tail technique to reduce the headroom voltage, presented in [11]. Triple-tail was successfully applied to the D-latch and multiplexer in [12] and [13].

The triple-tail structure removes one logic level from the stack. Although design with the triple-tail method removes one stack level, the advantage of this method disappears for complex gates. Folding some of the stages in the SCL structure reduces the number of stack levels. Consequently, the headroom voltage and power consumption of the gate are reduced. This technique was successfully implemented in latches [14] and in XOR, MUX, and latch gates [15].

The MTCML structure lowers the number of stacked nMOS levels by changing the structure of the nMOS network. MTCML swaps and folds fully differential nMOS pairs (those not connected to the output node) with pMOS differential pairs and adds appropriate current sources, which creates logic gates with a lower supply voltage than the conventional SCL. However, an MTCML works with a lower supply voltage than a triple tail, but with an increase in the number of inputs, it requires a larger supply voltage [16].

The MFMCML structure utilizes logical blocks (each block has a fixed number of stack levels) to design logic functions. In the MFMCML, a larger fan-in increases the number of blocks rather than the number of stacked stages [17]. The MFMCML method works with the lowest supply voltage, equal to the voltage required for a conventional SCL inverter. A flip-flop with the minimum power delay product or a flip-flop with the highest speed using MFMCML is reported in [18].

Although the MFMCML operates with the lowest voltage, one block is added to the structure for each fan-in. For instance, a simple five-input OR (OR_X5) requires five logic blocks.

In this article, we propose and comprehensively analyze the general scheme for implementing logic functions named CFIMTCML. Regardless of the number of fan-in, the CFIMTCML works with a minimum supply voltage equal to that of a conventional inverter. The CFIMTCML operates with a minimum number of logic blocks, which enables it to function at a minimum voltage and delay. The proposed logic utilizes a topology for low headroom voltages and results in ultralow-voltage operation. This method reduces the logic delay while preserving the die area. Therefore, the proposed approach can be utilized in applications with ultralow-voltage constraints, including mobile and portable devices.

It should be noted that in both the MFMCML structure and the proposed logic, the current mirror is utilized abundantly. Following [10], a specific symbol is adopted to represent the current mirrors and to simplify the schematics. These symbols are presented in Fig. 1(b). Also, for the current mirrors with one reference branch and two mirror branches, the symbols are shown in Fig. 1(c).

The rest of the article is organized as follows. Section II explains the conventional SCL, MTCML, and MFMCML logics. In Section III, the proposed approach is introduced and fully elaborated. Section IV presents the implementation, simulation, and a detailed comparison of the proposed method with recent works, and finally, conclusions are presented in Section V.

II. CMLS: SCL, MTCML, MFMCML

A. Conventional SCL

The conventional SCL uses an nMOS network and a tail current source to implement the gates. To implement logic gates in this method, the inputs steer the tail current along a specific path toward one of two load resistors (two pMOS transistors operating in the triode region). In the conventional SCL structure, for each extra input, one level is added to the stacked levels. This structure is shown in Fig. 2(a). For N inputs in the SCL logic, there are N + 2 stack levels (N for inputs, one for the current mirror, and one for the load). In SCL, if the number of fan-in increases, the headroom voltage increases. Consequently, power consumption increases. This inhibits implementing gates with more fan-in in SCL because the supply voltage is limited (it has decreased to below 1 V in recent years). In the SCL with a larger number of fan-in, the differential


Fig. 2. (a) Block scheme of a generic SCL gate with n levels. (b) SCL gates. (c) SCL SOP_X4 gate. (d) Dual of the SCL SOP_X4 gate.

pairs near the ground may enter the triode region, so level shifters are used to prevent this phenomenon [19].

The MUX, XOR, and LATCH gates in SCL are shown in Fig. 2(b). The MUX, XOR, and LATCH gates have the same structure in the nMOS network. However, despite their shared structure, these gates have different connections to the load resistances (pMOS transistors in triode). As shown in Fig. 2(b), we obtain a different gate if we change the connection network.

Fig. 2(c) illustrates how conventional SCL logic is used to implement the SOP_X4 gate, which is specified as OUT = (A · B) + (C · D).

SCL-based techniques can be compared if the overdrive voltage is hypothetically assumed to be 0.2 V in 180 nm technology. The SCL implementation of the SOP_X4 gate has four inputs, which requires six stacked logic levels and a minimum supply voltage of 1.2 V.

B. Low-Voltage Shallow-Depth Structure

In MTCML, all transistors connected to the coupled source of the differential pairs are folded, and appropriate current tails are added at the folding points. The general structure of MTCML is shown in Fig. 3(a). When the MTCML technique is applied to the fully differential pairs, all the transistors in these logic levels are folded and replaced with pMOS transistors. Therefore, the number of logic levels in the nMOS network is reduced, and the gate operates with a lower supply voltage [16].

To design SOP_X4 with MTCML, it is first implemented in SCL [i.e., Fig. 2(c)]. In the second step, the M1 transistor (which connects to the coupled source of the differential pair transistors M3 and M4) is folded. Finally, the appropriate current source is embedded at the folded point. The SOP_X4 in MTCML is shown in Fig. 3(b). This gate requires a minimum of 1 V headroom voltage because one of the stack levels is removed. Although implementing the SOP_X4 gate in MTCML requires a lower voltage than SCL, it is still higher than the supply voltage for a conventional inverter in SCL.

C. MFMCML Structure

In MFMCML, if the first input is connected to the nMOS differential pair, the second logic level is folded, and the nMOS differential pair is changed to a pMOS differential pair. The third logic level utilizes the nMOS differential pair again. Alternatively, if the first input is connected to the pMOS differential pair, the second input is applied to the nMOS differential pair. The current mirror is used to feed the current between the logic levels. With these explanations, one can consider a block for each fan-in. At the end of the MFMCML implementation algorithm, each block with a differential pair implies three levels of stacked overdrive headroom voltage. This voltage has reached the minimum possible value, equal to the headroom voltage of an inverter in the SCL logic [17]. The general structure of the MFMCML is shown in Fig. 4(a). The MFMCML SOP_X4 gate structure is shown in Fig. 5. For the MFMCML, the minimum voltage required is 0.6 V. By comparing the three types of implementation (SCL, MTCML, and MFMCML), it is concluded that the MFMCML logic operates with the minimum possible voltage for any number of inputs.

Although MFMCML logic has the minimum voltage headroom among the three methods (SCL, MTCML, and MFMCML), it uses plenty of current mirrors, which leads to reduced speed.

III. COMPLEMENTARY FOLD-INTERLEAVED MULTIPLE-TAIL CML

In implementation, SCL and the methods based on it, such as triple-tail, MTCML, and MFMCML, utilize nMOS blocks, nMOS and pMOS blocks, and alternating nMOS and pMOS blocks, respectively. According to the inputs, the tail current is routed to the MOS load (the pMOS transistor is biased in the triode region). The common theme of all these logics is current path selection and control.

The proposed logic (CFIMTCML) utilizes the folding technique and divides logic functions with a large depth into smaller blocks with a minimum depth. The CFIMTCML creates inverting and non-inverting functions with minimum depth.

The CFIMTCML implementation is based on an algorithmic approach. It is utilized in the implementation of relatively complex gates with high performance. Also, it is appropriate and efficient at low voltages.

Algorithm 1 shows the pseudo-code for the proposed logic approach. For further explanations regarding executions 2–9, please refer to [16], and for executions 10–12, please refer to [17]. The general structure of the CFIMTCML is shown in Fig. 4(b).
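The supply-voltage figures quoted in Section II for the SOP_X4 gate can be reproduced with a short calculation. The following is a minimal sketch, assuming (as Section II-A does hypothetically) an overdrive voltage of 0.2 V per stacked level and taking the level counts stated in the text: N + 2 for conventional SCL, one level fewer for MTCML, and three levels for MFMCML and CFIMTCML. The names `scl_levels` and `min_supply` are illustrative and do not come from the article.

```python
# Hedged sketch: minimum supply is approximated as (stacked levels) x (overdrive voltage).
V_OV = 0.2  # volts per stacked level, the hypothetical value assumed in Section II-A


def scl_levels(fan_in: int) -> int:
    """Stack levels of a conventional SCL gate: N inputs + current mirror + load."""
    return fan_in + 2


def min_supply(levels: int, v_ov: float = V_OV) -> float:
    """Approximate minimum supply voltage for a given number of stacked levels."""
    return levels * v_ov


# Level counts for the four-input SOP_X4 gate, as stated in the text.
sop_x4_levels = {
    "SCL": scl_levels(4),        # six levels
    "MTCML": scl_levels(4) - 1,  # folding removes one level
    "MFMCML": 3,                 # every block has three levels
    "CFIMTCML": 3,               # every block has three levels
}

for style, levels in sop_x4_levels.items():
    print(f"{style:9s} {levels} levels -> about {min_supply(levels):.1f} V")
```

Under these assumptions the sketch gives about 1.2 V for SCL, 1.0 V for MTCML, and 0.6 V for MFMCML and CFIMTCML, matching the values quoted in the text.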


Fig. 3. (a) Block scheme of a generic MTCML gate with n levels. (b) MTCML SOP_X4 gate. (c) Dual of the MTCML SOP_X4 gate. (d) Step 3 in the
CFIMTCML.

Fig. 4. (a) Block schematic of a generic MFMCML gate with n levels. (b) Block schematic of a generic proposed (CFIMTCML) gate with n levels.

Algorithm 1 Implementation of Input Logic Function in CFIMTCML Logic

For instance, Algorithm 1 is executed step by step to implement the SOP_X4 gate. First, the gate is designed in SCL [i.e., Fig. 2(d)]. In the second step, the M1 transistor (which connects to the coupled source of the differential pair transistors M3 and M4) is folded and then substituted with a pMOS transistor. Then, two tail current sources, each with the value ISS/2, are added individually to the source of the differential pair (M3-M4) and to the single transistor M2 [i.e., Fig. 3(c)]. In the third step, the first level of the SOP_X4 (the top stacked level) is kept untouched, and the rest of the logic levels are transferred to the next block with the folding technique. A current mirror is placed to feed current from the first level to the adjacent levels (the lower levels that are transferred to the next block). This step is shown in Fig. 3(d). The second and third steps are repeated until all the blocks have three stacked levels. The SOP_X4 gate in CFIMTCML is shown in Fig. 5(b). The significant differences between the implementations of CFIMTCML and MFMCML are marked with gray blocks in Fig. 5(a) and (b).

The delay analysis for a decoder with two levels is presented in [20]. By generalizing this approach, it is easy to show that the propagation delay of a gate with N levels of logic is equal to

$$ D_{N\,\mathrm{level}} = \max \sum_{m=1}^{N} \mathrm{Delay}_m. \tag{2} $$

If we consider adjacent blocks in CFIMTCML and MFMCML as stacked levels, we can utilize (2) for delay analysis. According to (2), the delays for MFMCML and CFIMTCML are equal to

$$ D_{(\mathrm{SOP\_X4},\,\mathrm{MFMCML})} = D_{block1} + D_{block2} + D_{block3} + D_{block4} \tag{3} $$

$$ D_{(\mathrm{SOP\_X4},\,\mathrm{CFIMTCML})} = D_{block1} + D_{block2} \tag{4} $$

where Dblock1 and Dblock4 in MFMCML are approximately equal to Dblock1 and Dblock2 in CFIMTCML, respectively; the only difference is that Dblock1 and Dblock2 have an


Fig. 5. (a) Conventional MFMCML SOP_X4 gate. (b) Proposed CFIMTCML SOP_X4 gate.

additional parasitic capacitance of transistors M1 and M5, respectively. By comparing relations (3) and (4), the propagation delay of MFMCML has two more terms (Dblock2 and Dblock3) than the propagation delay of CFIMTCML. Consequently, CFIMTCML is faster than MFMCML, as confirmed by the simulation results in Table I.

TABLE I
COMPARISON OF POWER CONSUMPTION AND DELAY OF THE SOP_X4 GATE IN SCL, MTCML, MFMCML, AND CFIMTCMLS (TECHNOLOGY: 180 nm)

A. Calculation of Common-Mode Voltage in the CFIMTCML

The output swing in the SCL circuits is equal to VSW = 2 × RL × VDD. Each block of the CFIMTCML has either nMOS source coupled pairs with pMOS loads in the triode region or pMOS source coupled pairs with nMOS loads in the triode region [Fig. 5(b)]. For this reason, CFIMTCML has two output common-mode voltages. In the first case, the inputs are connected to nMOS source coupled pairs, and the output common-mode voltage equals VDD − (VSW/4). In the second case, the inputs are connected to pMOS source coupled pairs, and the output common-mode voltage equals (VSW/4).

In order to obtain the minimum voltage in CFIMTCML without losing generality, Fig. 5(b) is considered. The SOP_X4 designed in the CFIMTCML has two blocks. In the second block, for all transistors to operate in the saturation region, it is required to have

$$ V_{DS,M7,M8} = V_{DD} - \left( \frac{V_{SW}}{8} \right) - V_{CM,M7,M8} - V_{GS,M7,M8}. \tag{5} $$

In the first block, for the pMOS transistors that feed the pMOS differential pair, such as the pMOS in the current mirror [which generates the current (ISS/2) in the SOP_X4 gate shown in Fig. 5(b)], saturation conditions are guaranteed by the following equation:

$$ V_{DS,CM-P} = V_{DD} - V_{CM,M3,M4} - V_{GS,M3,M4} > V_{DS,SAT}. \tag{6} $$

Finally, for the nMOS transistors that feed the nMOS differential pair, such as the nMOS transistors in the nMOS current mirror (CM-N) presented in Fig. 5(b), it is necessary to have

$$ V_{DS,CM-N} = V_{CM,M7,M8} - V_{GS,M7,M8} > V_{DS,SAT,CM-N}. \tag{7} $$

Moreover, the common-mode voltages of the nMOS differential pair and the pMOS differential pair are considered equal to

$$ V_{CM,NMOS} = V_{DD} - \frac{V_{SW}}{8} \tag{8} $$

$$ V_{CM,PMOS} = \frac{V_{SW}}{8}. \tag{9} $$

By using (8) and (9) in (6) and (7), we have (10) and (11), respectively.

$$ V_{DD} > \frac{V_{SW}}{8} + |V_{GS,M3,M4}| + V_{DS,SAT} \tag{10} $$

$$ V_{DD} > \frac{V_{SW}}{8} + V_{GS,M7,M8} + V_{DS,SAT,CM-N}. \tag{11} $$

Hence, generalizing the above relationships, for a CFIMTCML gate with any fan-in value, the minimum supply voltage is equal to

$$ V_{DD} = \frac{V_{SW}}{8} + V_{TH} + 2V_{OV}. \tag{12} $$

The overdrive voltage equals VOV = VGS − VTH, and VTH is the MOS threshold voltage. In conclusion, the minimum supply voltage for the CFIMTCML gate, regardless of the number of fan-in, is almost equal to that of a conventional SCL inverter.

For example, referring to the target standard 180 nm CMOS process with VTH in the range of 0.5–0.52 V, considering VSW = 0.4 V and designing VOV to be about 15 mV, (12) gives VDD,min,CFIMTCML in the range of 0.580–0.600 V, which allows using 0.6 V as the nominal supply voltage.

The noise margin of the inverter implemented in SCL logic is calculated in [11]. According to [11], the noise margin


of the SOP_X4 gate implemented in CFIMTCML is equal to NM = (Vswing/2).

Reducing the supply voltage reduces the power consumption of the circuit. The minimum supply voltage depends on the headroom voltage [21], [22]. The P-D analysis of the SOP_X4 for the variation of VDD from 1.8 to 0.6 V in 180 nm CMOS standard technology is illustrated in Fig. 6.

Fig. 6. Comparing the performance of the SCL, MTCML, MFMCML, and CFIMTCMLs in different headroom voltages.

The SOP_X4 gates implemented in the SCL, MTCML, MFMCML, and CFIMTCML are compared in terms of power consumption and circuit speed. The results are presented in Fig. 7.

Fig. 7. SOP_X4 gate P-D plots (per operation) of SCL, MTCML, MFMCML, and CFIMTCML as a function of VDD.

B. Power Consumption Modeling and Evaluation

To calculate the power consumption in each block, the total current that flows from VDD to the ground is calculated and multiplied by VDD. Consequently, in this method, to calculate the power consumption of the entire gate, the power consumption of all blocks is added together. Therefore, the general power consumption formula is equal to

$$ P_T = \sum V_{DD} \times I_{SS}. \tag{13} $$

C. Delay of the CFIMTCML Circuit

It is easy to calculate the time constant and delay with the half-circuit small-signal model [23].

The delay of the SOP_X4 gate is calculated in two cases. In the first case, the inputs are applied only to the second block, and the inputs of the first block are held fixed. The half-circuit model for the first case is shown in Fig. 8(a). In this case, the circuit delay is equal to

$$ \tau_{PD} = 0.69\, R_p \left( C_{db,ML2} + C_{dg,ML2} + C_{db,M6} + C_{dg,M6} + C_{db,M8} + C_{dg,M8} \right). \tag{14} $$

In the second case, the delay from the last stage of the pMOS and nMOS folded networks to the output node is considered. The inputs are applied to the pMOS and nMOS transistors in the first block, and all inputs of the nMOS and pMOS transistors in the second block are constant. The small-signal model of the SOP_X4 for this case is shown in Fig. 8(b). Current source transformation is used to simplify the half-circuit of the SOP_X4. The circuit resulting from the simplification is shown in Fig. 8(c). The half-circuit of the SOP_X4 in the second case has six poles. Each pole value equals the total capacitance multiplied by the total resistance seen from each node.

It is assumed that the B input is constant and the A input is a pulse signal. The effective output capacitance (CL′) is equal to

$$ C_{L'} = C_{db,ML2} + C_{dg,ML2} + C_{gd,M6} + C_{db,M6} + C_{dg,M8} + C_{db,M8}. \tag{15} $$

The effective CA capacitance is the sum of the drain-bulk and drain-gate capacitors of the M1 transistor and the gate-source and source-bulk capacitors of the M3 and M4 transistors

$$ C_A = C_{db,M1} + C_{dg,M1} + C_{sb,M3} + C_{gs,M3} + C_{sb,M4} + C_{sg,M4}. \tag{16} $$

The effective CB capacitance is the sum of the drain-bulk and drain-gate capacitors of the M4 and M2 transistors and the source-gate capacitors of the MM1 and MM2 transistors

$$ C_B = C_{db,M4} + C_{dg,M4} + C_{dg,M2} + C_{db,M2} + C_{sg,MM1} + C_{sg,MM2}. \tag{17} $$

The effective CC capacitance is the sum of the drain-bulk and drain-gate capacitors of the M5 and MM2 transistors and the source-bulk and source-gate capacitors of the M7 and M8 transistors

$$ C_c = C_{db,M5} + C_{dg,M5} + C_{dg,MM2} + C_{db,MM2} + C_{sb,M7} + C_{sg,M7} + C_{sb,M8} + C_{sg,M8}. \tag{18} $$

The Rcgd1 resistance at the drain-gate capacitor of the M1 transistor, and the RCA, RCB, and RC,c resistances at the CA, CB, and Cc capacitors, respectively, are

$$ R_{cgd1} = R_i + \frac{1}{g_{mM4}} + \left( \frac{1}{g_{mM4}} \times R_i \times g_{mM1} \right) \tag{19} $$

$$ R_{CA} = \frac{1}{g_{mM4}} \tag{20} $$

$$ R_{CB} = \frac{1}{g_{mM1}} \tag{21} $$

$$ R_{C,c} = \frac{1}{g_{mM8}} \parallel \frac{1}{g_{mM1}} \parallel \frac{1}{g_{mM2}} \tag{22} $$

Fig. 8. Half-circuit model of the proposed SOP_X4 CFIMTCML for delay calculation. (a) In first case. (b) In second case. (c) In second case with expanded current source.

$$ R_{cgd,M2} = R_{cc} + \frac{1}{g_{mM1}} + \left( \frac{1}{g_{mM1}} \times R_{cc} \times g_{mM2} \right) \tag{23} $$

$$ R_{Cl'} = R_L = R_p. \tag{24} $$

Finally, the delay of each node is equal to

$$ \tau_{cgd1} = 0.69 \left[ R_{cgd1} \times C_{cgd1} \right] \tag{25} $$

$$ \tau_{CA} = 0.69 \left[ R_{CA} \times C_{CA} \right] \tag{26} $$

$$ \tau_{c,B} = 0.69 \left[ R_{C,M1} \times C_{C,M1} \right] \tag{27} $$

$$ \tau_{Cc} = 0.69 \left[ R_{Cc} \times C_{Cc} \right] \tag{28} $$

$$ \tau_{cgd,M2} = 0.69 \left[ R_{cgd,M2} \times C_{cgd,M2} \right] \tag{29} $$

$$ \tau_{RL'} = 0.69 \left[ R_{L'} \times C_{RL'} \right]. \tag{30} $$

In the calculation of the total delay of the SOP_X4, it is required to consider both cases. For the first case, the maximum delay is given by (14). For the second case discussed in Section III-C, the largest of (25)–(30) is taken as the maximum delay of the SOP_X4. The maximum delays in the first and second cases are 122 and 118 ps, respectively. Therefore, the estimated propagation delay of 122 ps is very close to the simulated delay (124 ps), which confirms the analytical approach.

IV. SIMULATION

A. Simulation of the SOP_X4 in CFIMTCML

To further show the advantages of the proposed CFIMTCML approach discussed in Section III and to compare this approach with other ones, the SOP_X4 is implemented in the CFIMTCML and other methods. All of the simulations and comparisons are performed in CMOS 180 nm technology. In all of the simulations, the overdrive voltage is assumed to be 0.2 V to ensure operation in saturation. In the simulation of the SOP_X4 in SCL, which has six logic levels, a 1.2 V headroom voltage is required due to the number of logic levels and the overdrive voltage. The result of the simulation of the SOP_X4 in the SCL is shown in Fig. 9.

Fig. 9. SOP_X4 gate output in the SCL and the CFIMTCML.

The simulation of the SOP_X4 gate in the MTCML with five logic levels requires 1 V headroom voltage.

Simulating the SOP_X4 gate in MFMCML and CFIMTCML with three logic levels requires only 0.6 V headroom voltage. This headroom voltage is the minimum voltage needed for a conventional inverter in SCL. The simulation result of the SOP_X4 with CFIMTCML is shown in Fig. 9.

The comparisons of the SOP_X4 among CFIMTCML and the other logics are summarized in Table I. According to the results, both MFMCML and CFIMTCML work correctly at a headroom voltage equal to 0.6 V. However, CFIMTCML has less power consumption and a higher speed than MFMCML because CFIMTCML utilizes fewer logic blocks and fewer current mirrors.

According to Table I, the power consumption, propagation delay, and PDP are reduced by about 0.80%, 50%, and 51% for CFIMTCML versus MFMCML. Hence, all the results confirm the advantages of the CFIMTCML approach.

The full-stacked folding technique is utilized in the implementation of the LATCH, MUX, and XOR gates due to their inherent full-pair structures. The technique used in [15] is similar to the MTCML method, but a current source equal to or greater than the tail current source is placed in the source of the folded transistors.

In [14], the authors have implemented a latch controlled by two consecutive clocks. This gate consists of an OR gate and a latch. To reduce the logical depth, the OR gate was completely folded. In that case, the number of stacks was reduced to four levels.

B. Simulation of Other Gates in CFIMTCML

The OR logic functions with FIN greater than two such as OR_X3 (Fout = X0 + X1 + X2), OR_X4 (Fout = X0 + X1 + X2 + X3), and OR_X5 (Fout = X0 + X1 + X2 +


Fig. 10. (a) MFMCML 5-input OR gate (OR_5X). (b) CFIMTCML 5-input OR gate (OR_5X). (c) 8-bit carry generator chain structure, the basic block 8-bit
carry generator chain. (d) MFMCML and (e) CFIMTCML structure.

TABLE II
COMPARISON OF POWER CONSUMPTION AND DELAY OF THE OR GATE IN MFMCML AND CFIMTCMLs (TECHNOLOGY: 45 AND 22 nm)

TABLE III
DESIGN PARAMETER CFIMTCML @ CORNER ANALYSIS AND TEMPERATURE ANALYSIS

TABLE IV
CORNER/TEMPERATURE SIMULATION FOR CARRY-OUT IN CFIMTCML STRUCTURE

X3 + X4) were considered for simulation. For instance, the OR_5X gate in MFMCML logic and in the CFIMTCML structure is presented in Fig. 10(a) and (b), respectively. The differences between CFIMTCML and MFMCML are shown with colored blocks in Fig. 10(a) and (b).

In [17], OR_3X, OR_4X, and OR_5X gates in MFMCML logic were simulated in 28 nm FD-SOI CMOS technology, and the PDPs obtained were 0.65, 1.5, and 2 fJ, respectively. For comparison, the proposed gates were simulated in the closest available technology (the 22 nm PTM). In this simulation, OR_3X, OR_4X, and OR_5X gates in CFIMTCML logic demonstrated PDP values of 0.2, 0.14, and 0.21 fJ, respectively. These results confirm the improved performance of the proposed logic design.

To enable a more comprehensive comparison between the MFMCML and the proposed logic, OR logic functions with a fan-in (FIN) greater than two have been designed and simulated using the 22 and 45 nm PTMs. The outcomes of these simulations are displayed in Table II. They further corroborate the improved performance of the proposed logic design. Additionally, all circuits have been designed with a supply voltage of 0.6 V to emphasize the proposed solution's capacity to operate at very low voltages.

An 8-bit carry generator chain is shown in Fig. 10(c). The basic block of this chain in MFMCML logic and in the CFIMTCML structure is presented in Fig. 10(d) and (e), respectively. The differences between CFIMTCML and MFMCML are shown with gray blocks in Fig. 10(d) and (e).

Finally, to further evaluate the CFIMTCML approach, the propagation delay and power consumption of the carry generator have been investigated in all corners and at five different temperatures. These gates are prepared in 180 nm CMOS standard technology, whose main technology parameters are reported in Table III.

The simulation results for various corners and temperatures are presented in Table IV.

When the temperature is swept between −20 °C and 120 °C, the propagation delay ranges from 364.5 to 157 ps. Over the same temperature sweep, the power consumption varies from 10.6 to 17.8 pW. By changing corners, the propagation delay ranges from 537 to 69 ps.
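As a rough arithmetic check on the OR-gate comparison with [17] reported in this section, the following sketch computes the relative PDP reduction from the quoted values. The function and variable names are illustrative assumptions and do not come from the paper. Note also that the two sets of values were obtained in different technologies (28 nm FD-SOI for [17], the 22 nm PTM for CFIMTCML), so the percentages are only indicative.

```python
# PDP values (in fJ) quoted in the text for OR_3X, OR_4X, and OR_5X.
# MFMCML figures are those reported in [17] (28 nm FD-SOI);
# CFIMTCML figures are from the 22 nm PTM simulation.
# All names below are hypothetical and chosen for illustration only.
PDP_MFMCML_FJ = {"OR_3X": 0.65, "OR_4X": 1.5, "OR_5X": 2.0}
PDP_CFIMTCML_FJ = {"OR_3X": 0.2, "OR_4X": 0.14, "OR_5X": 0.21}


def pdp_reduction_percent(reference_fj, proposed_fj):
    """Relative PDP reduction of the proposed gate versus the reference, in percent."""
    if reference_fj <= 0:
        raise ValueError("reference PDP must be positive")
    return 100.0 * (reference_fj - proposed_fj) / reference_fj


def compare_or_gates():
    """Return {gate name: PDP reduction in percent} for the three OR gates."""
    return {
        gate: pdp_reduction_percent(PDP_MFMCML_FJ[gate], PDP_CFIMTCML_FJ[gate])
        for gate in PDP_MFMCML_FJ
    }


if __name__ == "__main__":
    for gate, reduction in compare_or_gates().items():
        print(f"{gate}: {reduction:.1f}% lower PDP")
```

With the quoted values, the reductions come out at approximately 69%, 91%, and 90% for OR_3X, OR_4X, and OR_5X, respectively.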


Fig. 11. (a) Cell-view of the CFIMTCML carry-out cell. (b) Post-layout for carry-out. (c) Cell-view of the CFIMTCML 8-bit carry generator chain cell.
(d) Post-layout for 8-bit carry generator chain cell.

Fig. 12. (a) Cell-view of the CFIMTCML SOP_X4 cell. (b) Post-layout for SOP_X4.

Fig. 13. Monte Carlo simulation of CFIMTCML carry-out (a) power distribution and (b) delay distribution.

Next, the carry generator's basic block is implemented in SCL, DSCL [24], DCVSL [25], MTCML [16], SDDSCL [22], MFMCML [17], and CFIMTCML. The results are presented in Table V.

As shown in Table V, the required headroom voltages differ because the number of logic levels needed to implement the carry generator varies from one logic style to another. All the implementations of the carry generator work correctly down to 0.9 V. With a further reduction of the headroom voltage, limited by the overdrive voltage, only two logics, MFMCML and CFIMTCML, still work correctly.

Compared with the carry generator implemented in SCL, DSCL [24], DCVSL [25], MTCML [16], SDDSCL [22], and MFMCML [17], the CFIMTCML implementation reduces the PDP by 80%, 26%, 30%, 21%, 21%, and 54%, respectively.

Finally, an 8-bit carry generator with eight carry basic blocks has been simulated and investigated in the CFIMTCML and the other presented logics. The simulation results of the 8-bit carry generator are shown in Table V.

In this simulation, the 8-bit carry generator in the CFIMTCML is compared with the other logics in terms of power consumption, delay, and PDP.

As can be seen in Table V, the PDP of the 8-bit carry generator in the CFIMTCML is reduced by approximately 96%, 71%, 72%, 71%, 71%, and 40% compared to SCL, DSCL [24], DCVSL [25], MTCML [16], SDDSCL [22], and MFMCML [17], respectively.

It should be noted that all of the results obtained in the simulations and comparisons for SOP_X4, the basic block of the carry generator, and the 8-bit carry generator confirm the advantages of the proposed method.

TABLE V
OVERHEAD AND PERFORMANCE COMPARISON OF DSCL, SDDSCL, DCVSL, MTCML, SCL, MFMCML, AND CFIMTCML ELEMENTS FOR CARRY GENERATOR/8-bit CARRY GENERATOR (TECHNOLOGY: 180 nm)

C. Functionality Analysis Based on the Layout

Fig. 11 shows the layout cell view and post-layout of the basic block of the carry generator and the 8-bit carry generator chain for the CFIMTCML. Fig. 12 shows the layout of the SOP_X4. These gates are prepared in 180 nm CMOS standard technology. The areas of the proposed "SOP_X4," "carry generator," and "8-bit carry generator chain" circuits are 241.4, 303.6, and 4716 µm², respectively. The Monte Carlo simulation result for the power distribution of the CFIMTCML carry generator is illustrated in Fig. 13(a), and the result for the delay distribution is shown in Fig. 13(b).

V. CONCLUSION

In this article, a new approach is presented that allows MCML gates to be implemented with a fan-in higher than two while keeping the minimum supply voltage as low as that of a conventional SCL inverter.

The CFIMTCML methodology, named complementary fold-interleaved multiple-tail CML, was born out of a generalization and composition of the MTCML and MFMCML. The CFIMTCML reduces energy consumption by using folding techniques and by converting a big block (the nMOS network implementing a complex function) into blocks with a fixed number of levels.

The proposed logic uses a topology that requires low headroom voltage and therefore operates at ultralow supply voltages. The proposed approach can thus be utilized in ultralow-voltage applications.

Finally, PDP is significantly reduced. The approach is compared with other methods presented by the authors, and the results show that the performance of the CFIMTCML is improved compared to those methods. Analytical details and the design implementation approach are also presented. In particular, the post-layout and schematic simulations in 180 nm standard CMOS show that, in the implementation of the Sum of Product (SOP_X4), the proposed logic achieves 90%, 20%, and 50% lower PDP than conventional SCL, MTCML, and MFMCML, respectively. In addition, the proposed CFIMTCML logic achieves a PDP reduction of at least 20% relative to the other logics, such as conventional SCL, MTCML, and MFMCML, in implementing the carry generator and the 8-bit carry generator.

ACKNOWLEDGMENT

The authors would like to thank Dr. Rafiee and Zanjani for helping in design improvement.

REFERENCES

[1] J. M. Musicer and J. Rabaey, “MOS current mode logic for low power, low noise CORDIC computation in mixed-signal environments,” in Proc. ISLPED, 2000, pp. 102–107.
[2] I. Savidis, S. Kose, and E. G. Friedman, “Power noise in TSV-based 3-D integrated circuits,” IEEE J. Solid-State Circuits, vol. 48, no. 2, pp. 587–597, Feb. 2013.
[3] S. Badel, “MOS current-mode logic standard cells for high-speed low noise applications,” EPFL, Lausanne, Switzerland, Tech. Rep., 2008.
[4] M. Anis, M. Allam, and M. Elmasry, “Impact of technology scaling on CMOS logic styles,” IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process., vol. 49, no. 8, pp. 577–588, Aug. 2002.
[5] D. J. Allstot, S. Kiaei, and R. H. Zele, “Analog logic techniques steer around the noise,” IEEE Circuits Devices Mag., vol. 9, no. 5, pp. 18–21, Sep. 1993.
[6] F. Centurelli, G. Scotti, and G. Palumbo, “0.5-V frequency dividers in folded MCML exploiting forward body bias: Analysis and comparison,” Electronics, vol. 10, no. 12, p. 1383, Jun. 2021.
[7] A. Tajalli, E. J. Brauer, Y. Leblebici, and E. Vittoz, “Subthreshold source-coupled logic circuits for ultra-low-power applications,” IEEE J. Solid-State Circuits, vol. 43, no. 7, pp. 1699–1710, Jul. 2008.
[8] M. Alioto and G. Palumbo, Model and Design of Bipolar and MOS Current-Mode Logic: CML, ECL and SCL Digital Circuits. Springer, 2006.
[9] M. Alioto and G. Palumbo, “Power-delay optimization of D-latch/MUX source coupled logic gates,” Int. J. Circuit Theory Appl., vol. 33, no. 1, pp. 65–86, Jan. 2005.
[10] S. Hemati, A. H. Banihashemi, and C. Plett, “A 0.18-µm CMOS analog min-sum iterative decoder for a (32,8) low-density parity-check (LDPC) code,” IEEE J. Solid-State Circuits, vol. 41, no. 11, pp. 2531–2540, Nov. 2006.
[11] K. Gupta, N. Pandey, and M. Gupta, Model and Design of Improved Current Mode Logic Gates. Singapore: Springer, 2020.
[12] K. Gupta, N. Pandey, and M. Gupta, “MCML D-latch using triple-tail cells: Analysis and design,” Act. Passive Electron. Compon., vol. 2013, pp. 1–9, Nov. 2013.
[13] K. Gupta, N. Pandey, and M. Gupta, “Low-voltage MOS current mode logic multiplexer,” Radioeng. J., vol. 22, pp. 259–268, Apr. 2013.
[14] A. Tajalli and M. Atarodi, “Linear phase detection using two-phase latch,” Electron. Lett., vol. 39, no. 24, p. 1695, 2003.
[15] S.-J. Song, S. M. Park, and H.-J. Yoo, “A 4-Gb/s CMOS clock and data recovery circuit using 1/8-rate clock technique,” IEEE J. Solid-State Circuits, vol. 38, no. 7, pp. 1213–1219, Jul. 2003.
[16] M. B. Ghaznavi-Ghoushchi and S. A. H. Ejtahed, “MTCML: Analysis, design and optimization of an alternative shallow-depth multiple-tail current mode logic,” Microelectron. J., vol. 67, pp. 57–70, Sep. 2017.


[17] G. Palumbo and G. Scotti, “A multi-folded MCML for ultra-low-voltage high-performance in deeply scaled CMOS,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 67, no. 12, pp. 4696–4706, Dec. 2020.
[18] F. Centurelli, G. Scotti, and G. Palumbo, “A very-low-voltage frequency divider in folded MOS current mode logic with complementary n- and p-type flip-flops,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 29, no. 5, pp. 998–1008, May 2021.
[19] M. Alioto and G. Palumbo, “Feature-power-aware design techniques for nanometer MOS current-mode logic gates: A design framework,” IEEE Circuits Syst. Mag., vol. 6, no. 4, pp. 42–61, 4th Quart., 2006.
[20] S. M. Müller and W. J. Paul, Eds., The Complexity of Simple Computer Architectures. Berlin, Germany: Springer, 1995.
[21] N. Lotze and Y. Manoli, “A 62 mV 0.13 µm CMOS standard-cell-based design technique using Schmitt-trigger logic,” IEEE J. Solid-State Circuits, vol. 47, no. 1, pp. 47–60, Jan. 2012.
[22] M. Rafiee and M. B. Ghaznavi-Ghoushchi, “Design of low-voltage shallow-depth differential source coupled logic using feedback and feedforward techniques,” Microelectron. J., vol. 86, pp. 140–149, Apr. 2019.
[23] O. Musa and M. Shams, “An efficient delay model for MOS current-mode logic automated design and optimization,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 57, no. 8, pp. 2041–2052, Aug. 2010.
[24] M. E. S. Elrabaa, “A new static differential CMOS logic with superior low power performance,” Anal. Integr. Circuits Signal Process., vol. 43, no. 2, pp. 183–190, May 2005.
[25] L. Heller, W. Griffin, J. Davis, and N. Thoma, “Cascode voltage switch logic: A differential CMOS logic family,” in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, Feb. 1984, pp. 16–17.

R. Malmir received the B.Sc. degree from the Sattari University, Tehran, Iran, in 2011, and the M.Sc. degree from the Shahed University, Tehran, in 2022, where he is currently working toward the Ph.D. degree.
His current research interests include the design of mixed-signal and ultralow-power circuits, the IoT, and machine learning.

M. B. Ghaznavi-Ghoushchi received the B.Sc. degree from Shiraz University, Shiraz, Iran, in 1993, and the M.Sc. and Ph.D. degrees from Tarbiat Modares University (TMU), Tehran, Iran, in 1997 and 2003, respectively.
From 2003 to 2004, he was a Researcher at the TMU Institute of Information Technology. He is currently an Associate Professor with Shahed University, Tehran. His interests include VLSI design, low-power and energy-efficient circuit and systems, and computer aided design automation for mixed signal and UML-based designs for SoC and Mixed-Signal.
