
Received April 6, 2021, accepted April 21, 2021, date of publication April 26, 2021, date of current version May 4, 2021.
Digital Object Identifier 10.1109/ACCESS.2021.3075460

Differential Read/Write 7T SRAM With Bit-Interleaved Structure for Near-Threshold Operation
JI SANG OH 1 , (Member, IEEE), JUHYUN PARK 1,2 , (Member, IEEE),
KEONHEE CHO 1 , (Member, IEEE), TAE WOO OH 1 , (Member, IEEE),
AND SEONG-OOK JUNG 1 , (Senior Member, IEEE)
1 School of Electrical and Electronics Engineering, Yonsei University, Seoul 03722, South Korea
2 DRAM Development Division, SK Hynix Inc., Icheon-si 467866, South Korea
Corresponding author: Seong-Ook Jung ([email protected])
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) under
Grant 2021R1A2C2008297.

ABSTRACT Near-threshold voltage (Vth) operation is an effective method for lowering energy consumption. However, it significantly increases the impact of Vth variation, which makes it difficult for previously proposed static random access memory (SRAM) bitcells to achieve high read stability and write ability yields. To achieve both in the near-Vth region, a differential 7T SRAM bitcell is proposed in which an nMOS transistor, controlled by an additional row-based signal, is inserted between the pull-up and pull-down transistors on one side of the cross-coupled inverters. In addition, the proposed SRAM bitcell can use a bit-interleaved structure without the half-select issue. Compared to the differential 10T and 12T SRAMs, the proposed differential 7T SRAM achieves a 5% and 6% higher operating frequency and a 70% and 23% lower operation energy consumption with a 33% and 49% smaller bitcell area, respectively.

INDEX TERMS 7T bitcell, half-select issue, low energy consumption, near-threshold voltage, static random
access memory (SRAM).

The associate editor coordinating the review of this manuscript and approving it for publication was Khursheed Aurangzeb.

I. INTRODUCTION
Recently, the demand for a low-energy system on chip (SoC) for application in the Internet of Things (IoT), biomedical implants, and energy harvesting devices has increased significantly. The most efficient method to reduce energy consumption is to decrease the supply voltage (VDD). This is because it exerts quadratic and exponential impacts on the dynamic and standby energy consumptions, respectively. However, decreasing VDD to the sub-threshold voltage (sub-Vth) region degrades circuit performance by at least three orders of magnitude from that in the super-Vth region. In contrast, in the near-Vth region, the performance degradation is by one or two orders of magnitude from that in the super-Vth region. Thus, energy consumption and circuit performance can be appropriately balanced in the near-Vth region [1], [2].

However, there are several problems with near-Vth operation. First, the critical charge for causing soft errors decreases, thereby making the SRAM bitcell vulnerable to them [3]. In the non-bit-interleaved structure shown in Fig. 1(a), if multi-bit errors occur in a word, many parity bits and a complex error correction code (ECC) circuit are required to correct the errors, which causes a large area penalty and high energy consumption. Therefore, to prevent multi-bit errors in a word, the bit-interleaved structure shown in Fig. 1(b) [4] needs to be used.

The other problem is that the circuit operation yield degrades because of the large influence of Vth variation. Among the various modules in an SoC, the yield of the static random access memory (SRAM) is significantly degraded by Vth variation because the SRAM bitcell consists of small transistors for achieving high-density integration. In addition, because the conventional six-transistor (6T) SRAM bitcell has a trade-off between the read stability and write ability yields, the target yield cannot be achieved in the read and write operations simultaneously in the near-Vth region [5].

Various SRAM bitcell structures have been proposed to eliminate the trade-off between the read stability and write ability yields. However, these bitcells have some drawbacks, such as the half-select issue, large area overhead, high energy consumption, and performance degradation.
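The benefit of bit-interleaving described above can be illustrated with a minimal Python sketch. It assumes a simple modulo mapping in which physical column c of an N:1 interleaved row belongs to word c mod N; this is one common arrangement and not necessarily the exact layout of Fig. 1. The function names are hypothetical.

```python
def column_to_word_bit(col, interleave):
    """Map a physical column to (word index, bit index).

    Assumption: in an N:1 bit-interleaved row, column c holds bit c // N
    of word c % N. With interleave == 1 the row holds a single word.
    """
    return col % interleave, col // interleave


def bits_hit_per_word(first_col, width, interleave):
    """Count how many bits of each word lie inside a burst of adjacent columns."""
    hits = {}
    for col in range(first_col, first_col + width):
        word, _ = column_to_word_bit(col, interleave)
        hits[word] = hits.get(word, 0) + 1
    return hits


if __name__ == "__main__":
    # A disturbance covering two adjacent columns, starting at column 4.
    print("non-interleaved:", bits_hit_per_word(4, 2, 1))
    print("2:1 interleaved:", bits_hit_per_word(4, 2, 2))
```

Under this assumed mapping, a burst that spans two adjacent columns corrupts two bits of the same word without interleaving but only one bit in each of two words with 2:1 interleaving, which is why a simpler ECC per word can be sufficient.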

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
VOLUME 9, 2021 64105
J. S. Oh et al.: Differential Read/Write 7T SRAM With Bit-Interleaved Structure for Near-Threshold Operation

FIGURE 1. The (a) non-bit-interleaved and (b) 2:1 bit-interleaved SRAM structures.

In this paper, the differential 7T SRAM bitcell is proposed to address these issues by incorporating a properly controlled nMOS transistor between the pull-up and pull-down transistors on the same side in a conventional 6T SRAM bitcell. This paper is organized as follows. In Section II, the previously proposed SRAM bitcells are overviewed, while Section III covers the proposed differential 7T SRAM bitcell's structure and operation. Section IV provides simulation results, and conclusions are presented in Section V.

II. PREVIOUSLY PROPOSED SRAM BITCELLS
Various SRAM bitcells have been proposed for operation in the near-Vth region. They can be classified as single-ended and differential SRAM bitcells depending on how the stored data is sensed (Fig. 2: single-ended 7T-1 [6], 7T-2 [7], and 8T [8] SRAM bitcells and Fig. 3: differential 10T [10], 12T [11], and P-P-N 10T [12] SRAM bitcells).

FIGURE 2. Single-ended (a) 8T, (b) 7T-1, and (c) 7T-2 SRAM bitcells.

A. SINGLE-ENDED SRAM BITCELLS
The 7T-1 SRAM bitcell has improved read stability achieved
by cutting off the positive feedback in the cross-coupled
inverters, while the single-ended 7T-2 and 8T SRAM bitcells
have improved read stability due to decoupling of the data
node from the additional read bitline (RBL).
However, during the write operation, the row half-selected bitcells (RHSCs) in the 7T-1, 7T-2, and 8T SRAM bitcells suffer from read disturbance, which can be resolved using the write-back scheme [9]. This scheme, in turn, requires a sensing circuit and a write driver at each column, which incurs a substantially larger area penalty and write energy consumption overhead.
Single-ended SRAM bitcells have the advantage of saving the read energy because the RBL is discharged with a probability of 50% depending on the stored data. However, they have a long read delay because the inverter sense amplifier needs a large voltage swing for data sensing. Therefore, the advantage of saving the BL discharge energy can be canceled out by the increase in static energy due to the long read delay.

FIGURE 3. (a) Differential 10T, (b) 12T, and (c) P-P-N 10T SRAM bitcells.

B. DIFFERENTIAL SRAM BITCELLS
Differential 10T, 12T, and P-P-N 10T SRAM bitcells without the write-back scheme have been proposed to overcome the half-select issue. A pass gate controlled by the column-based WWL was added to the differential 10T and 12T SRAM bitcells, while pseudo-data nodes pQ and pQb were used in the P-P-N 10T SRAM bitcell to prevent data loss in the RHSCs. These bitcells have a shorter read delay than single-ended SRAM bitcells because the differential voltage latch sense amplifier (VLSA) needs a small voltage swing for data sensing. However, the increased number of transistors incurs a large bitcell area. Moreover, these bitcells have a high read energy consumption due to high BL capacitance caused by the increased bitcell height. Therefore, a new SRAM bitcell is required to resolve the half-select issue with a fast delay, a small area, and low energy consumption.

FIGURE 4. (a) The proposed differential 7T SRAM bitcell structure: waveforms during the (b) read “0” and (c) read “1” operations.

FIGURE 5. The selected bitcell and RHSC during the write “1” operation.

FIGURE 6. Waveforms in the RHSCs of the proposed differential 7T SRAM bitcell with Q stored as “0” during a write operation according to TW_P1. Data in the RHSCs (a) is flipped with a short TW_P1 (b) but stably maintained with a sufficiently long TW_P1.

FIGURE 7. The write “1” operation in the proposed differential 7T SRAM bitcell according to TW_P2. The write operation (a) fails when TW_P2 is too short (b) but succeeds when TW_P2 is sufficiently long.

III. PROPOSED DIFFERENTIAL 7T SRAM BITCELL
Fig. 4(a) shows the structure of the proposed differential 7T SRAM bitcell. Compared to the conventional 6T SRAM bitcell, it has an additional transistor (PMR) driven by a word-line right bar (WLRB); word-line (WL) and WLRB are row-based signals. Prior to the read and write operations, WL, WLRB, and BL/BLB are set to VSS, VDD, and VDD, respectively.

A. READ OPERATION
Fig. 4(b) and 4(c) show the read “0” and read “1” operations and their waveforms, respectively. Before the read operation starts in the proposed differential 7T SRAM bitcell, BL and BLB are precharged to VDD, as in the conventional 6T SRAM bitcell. Afterward, the read operation begins by increasing the WL and decreasing the WLRB. Accordingly, BL or BLB is discharged when the Q node stores “0” (read “0”) or “1” (read “1”), respectively. In the conventional 6T SRAM bitcell, the charge injected from the BL or BLB is likely to cause data flip during the read operation, which degrades the read stability. Meanwhile, in the proposed differential 7T SRAM bitcell, this data loss risk is mitigated by turning off the PMR (WLRB = VSS) during the read operation, which improves the read stability for both read “0” and “1” operations. For a read “0” operation, the read current flows through the PGL and PDL. This causes an increase in the data Q node voltage
(disturbance from BL). However, this cannot flip the Qb node because its pull-down path is disconnected by turning off the PMR. Therefore, when the read operation is finished, the increased Q node voltage can be completely recovered to “0.” In contrast, in the case of a read “1” operation, the read current flows through PGR and PDR, which increases the M node voltage (disturbance from BLB). This disturbance cannot affect the Qb node because it is isolated from the M node by turning off the PMR. Although the Qb node floats, its voltage does not become high enough to flip the data Q node. This is because it remains at a lower voltage than VSS owing to the coupling caused by the signal transition of WLRB from VDD to VSS. Therefore, the proposed differential 7T SRAM bitcell can endure read disturbance.

B. WRITE OPERATION
The write operation is classified into write “0” (VBL = VSS and VBLB = VDD) and write “1” (VBL = VDD and VBLB = VSS) operations. A write operation starts when WL increases and WLRB decreases. Although the proposed 7T SRAM bitcell has a differential BL pair, it performs a single-ended write operation because the Qb node is not connected to the BLB by the turned-off PMR transistor. The single-ended write operation exhibits a lower write ability yield than a differential write operation, particularly when the Q node needs to be altered from “0” to “1” during the write “1” operation. This is because the PGL nMOS pass gate cannot deliver a full “1” to the Q node. To prevent a single-ended write operation, the WLRB is reset to VDD after a delay of TW_P1 from enabling WL. In this regard, TW_P1 should be carefully determined because the WLRB signal affects the RHSCs, as shown in Figs. 5 and 6. If TW_P1 is too short, data in the RHSCs can be flipped because a high BL or BLB voltage level in the unselected columns causes a large disturbance, as shown in Fig. 6(a). Thus, TW_P1 should be sufficiently long for the BL or BLB of the unselected column to be discharged before the WLRB is reset. This causes a smaller disturbance, as shown in Fig. 6(b).

Another timing constraint that needs to be considered is TW_P2, which is the duration for which WL and WLRB are simultaneously high after the WLRB is reset. If TW_P2 is too short, the write operation fails because the time to flip the data is insufficient, as shown in Fig. 7(a). Thus, the WL should be lowered only after a sufficiently long TW_P2 has elapsed since the WLRB was reset, as shown in Fig. 7(b).

As TW_P1 (TW_P2) increases, the read stability yield of the RHSCs (the write ability yield of the selected bitcells) is improved. However, the write delay increases. Thus, this influence must be considered when the SRAM operating frequency is determined.

Vth in the saturation and linear modes were measured as VGS when IDS per effective width was 10⁻⁵ A/µm, with |VDS| = VDD and |VDS| = 0.05 V, respectively [17]. The sub-threshold swing was measured as VGS_10times − 0.05 V, where VGS_10times is the VGS at which IDS is 10 times that at VGS = 0.05 V.

FIGURE 8. WL/WLRB (a) generator and (b) waveforms of the proposed differential 7T SRAM including a replica BL.

C. WL/WLRB GENERATOR
Fig. 8(a) and 8(b) show the WL/WLRB generator and its waveforms, respectively. Commonly, a replica BL and delay lines are used to generate the sense amplifier enable (SAE) signal and to disable the WL-enable signal (WLEN). To generate TW_P1 and TW_P2 for the proposed bitcell, an even number (NTWP1) and an odd number (NTWP2) of delay line stages are used, respectively. Several of the NTWP1 delay line stages are shared with the path for generating the SAE.

The replica BL node (BLRep) is precharged to VDD prior to enabling the WL. After the WL is enabled, the BLRep is discharged. Subsequently, OUTNTWP1 is decreased during the write operation (WT = 1), which disables the WLRB. In contrast, during a read operation (WT = 0), OUTNTWP1 is always VDD regardless of BLRep. Thus, WLRB is not reset before WL is disabled. In this manner, unnecessary toggling
of the internal nodes is circumvented during read operations to reduce the power consumption.

TABLE 1. Model characteristics for VDD = 0.8 V.

FIGURE 9. Layout for (a) a single bitcell and (b) the 2 × 2 array in the proposed differential 7T SRAM based on 22-nm FinFET technology.
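
The WL/WLRB sequencing described in Sections III-B and III-C can be summarized as a short Python sketch that lists the control events of one access. It is only a behavioral illustration, not the authors' generator circuit: the function and field names are hypothetical, the individual TW_P1 and TW_P2 values are placeholders, and returning WLRB to VDD at the same instant WL falls in a read is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    t_ns: float   # time relative to WL being enabled
    signal: str   # "WL" or "WLRB"
    level: str    # "VDD" or "VSS"


def wl_wlrb_events(is_write, t_w_p1_ns, t_w_p2_ns, t_read_ns):
    """Return the ordered WL/WLRB transitions for one access.

    Idle state (from the text): WL = VSS, WLRB = VDD.
    Every access starts by raising WL and lowering WLRB.
    """
    events = [Event(0.0, "WL", "VDD"), Event(0.0, "WLRB", "VSS")]
    if is_write:
        # Write: WLRB is reset to VDD after TW_P1, then WL and WLRB
        # stay high together for TW_P2 before WL is lowered.
        events.append(Event(t_w_p1_ns, "WLRB", "VDD"))
        events.append(Event(t_w_p1_ns + t_w_p2_ns, "WL", "VSS"))
    else:
        # Read: WLRB is not reset before WL is disabled.
        # Assumption: both return to their idle levels together.
        events.append(Event(t_read_ns, "WL", "VSS"))
        events.append(Event(t_read_ns, "WLRB", "VDD"))
    return events


if __name__ == "__main__":
    # Placeholder timing values chosen only for illustration.
    for event in wl_wlrb_events(True, 1.50, 0.76, 1.55):
        print(event)
```

In this model the write pulse width of WL equals TW_P1 + TW_P2, which is why lengthening either interval improves yield at the cost of write delay.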

D. BITCELL LAYOUT
Fig. 9(a) and 9(b) show the proposed differential 7T SRAM bitcell and 2 × 2 array layouts based on 22-nm FinFET technology, respectively. The local interconnects (L1 and L2) in the middle of the layer are applied to reduce the number of metal layers [13]. A dummy poly gate is incorporated for the regular pitch of the poly gate. The BL/BLB, VDD, and VSS are routed using metal 2, the WL and WLRB are routed using metal 3, and the inner routing of the bitcell is achieved using metal 1.

FIGURE 10. Bitcell area comparison.

IV. SIMULATION RESULTS AND COMPARISON
The proposed differential 7T SRAM bitcell was verified via an HSPICE Monte Carlo simulation using a 22-nm BSIM-CMG FinFET model [14]. The characteristics of this model were fitted to those of a commercial low-power device model based on the measured data for 22-nm FinFET silicon [15]. Moreover, the parasitic capacitances were fitted to the TCAD simulation results for the 22-nm FinFET [16]. The model characteristics are listed in Table 1. It was assumed that the Vth variation in each transistor follows a Gaussian distribution [17] with a standard deviation (σVth) expressed as

$$\sigma_{V_{th}} = \frac{A_{Vt}}{\sqrt{\text{Length} \times \text{Effective Width}}} \qquad (1)$$

where an AVt of 1.8 mV·µm was used for the FinFET with a lightly doped channel [18] to consider the random dopant fluctuation, work function variation, fin width variation, and gate length variation. The effective width is the sum of the fin thickness and twice the fin height.

To improve the accuracy of the simulation result, wire parasitic resistance (R) and capacitance (C) were modeled by using the π-RC wire model based on R per length of 21 ohm/µm and C per length of 0.16 fF/µm, as reported in [19]. When the performance was compared between the proposed differential 7T SRAM bitcell and previously proposed SRAM bitcells, the bitcell located farthest from the peripheral circuit was taken as the worst case.

The read static noise margin [20] and WL write trip voltage [21] are commonly used metrics to measure the read stability and write ability yields, respectively. However, these static metrics are unsuitable for the proposed differential 7T SRAM bitcell because they cannot consider the floating node and WLRB pulse width [22], [23]. In this study, the read and write failure samples in the transient Monte Carlo simulation results were counted to account for the floating node and WLRB pulse width, and importance sampling based on [24] was used to reduce the number of samples in the transient Monte Carlo simulation. In the comparison, an SRAM bitcell array that has 256 rows and 128 columns in a 4:1 bit-interleaved structure
was assumed, and the operating voltage was set to VDD = 0.5 V in the near-Vth region.

FIGURE 11. Comparison of the read delay of the bitcells.

FIGURE 12. Worst case scenario and RBL voltage during a read “0” operation in 7T-2.

A. BITCELL AREA
Fig. 10 shows a comparison of the bitcell area of the proposed differential 7T SRAM bitcell with those of the previously proposed SRAM bitcells. All the bitcell areas are estimated using the 22-nm FinFET technology. To minimize the layout area, the number of fins in each transistor was set to one.

The differential 10T, 12T, and P-P-N 10T SRAM bitcells have significantly large area overheads because of the large number of transistors. In contrast, the proposed differential 7T SRAM bitcell has a reasonably small area overhead compared to the conventional 6T SRAM bitcell. Although the previously proposed single-ended 8T, 7T-1, and 7T-2 SRAM bitcells have a smaller bitcell area than the proposed differential 7T SRAM bitcell, they exhibit problems such as a long read delay or sensing failure.

B. READ DELAY AND READ STABILITY
The read delay comprises the WL decoding, BL development (TBL), and sensing/data-out delays. In the differential operation, TBL is defined as the time between when the WL is enabled and when the SAE is enabled. The SAE should be enabled when the voltage development between the BL and BLB (ΔVBL) is larger than the offset voltage (VOS) of the sense amplifier for correct sensing operations. It was assumed that the mean of the VOS distribution (µVOS) is zero because a sense amplifier has a completely symmetrical structure. Moreover, the standard deviation of the VOS distribution (σVOS) was determined as 20 mV because the industry target of 3σVOS is typically set as 50–70 mV [25]. TBL_5σ is defined as the TBL required to ensure the 5σ sensing yield. The sensing yield is calculated through importance sampling, in which the sensing failure is counted when ΔVBL is smaller than 5σVOS [26].

In the proposed differential 7T SRAM bitcell, TBL_5σ during the read “0” operation is different from that during the read “1” operation owing to the asymmetrical structure. During the read “0” operation in the proposed differential 7T SRAM bitcell, the M node is not driven to VDD because of the turned-off PMR transistor. This implies that VBLB is disconnected from the VDD of the selected bitcell. Thus, VBLB during a read “0” operation is slightly reduced because of the leakage current from the unselected bitcells in the selected column compared to VBL during a read “1” operation, which results in a smaller ΔVBL. Therefore, TBL_5σ of a read “0” operation is slightly larger than that of a read “1” operation. In this study, to consider the worst case, the TBL_5σ of a read “0” operation was used for comparison. The TBL_5σ in the proposed differential 7T SRAM bitcell is 1.55 ns.

The proposed differential 7T SRAM bitcell has a smaller TBL_5σ compared to the differential 10T, 12T, and P-P-N 10T SRAM bitcells because it exhibits the lowest BL capacitance owing to the smallest bitcell layout height. Moreover, in the differential 10T, 12T, and P-P-N 10T SRAM bitcells, the VVSS node voltage slightly increases in the read operation because charges flow into the VVSS node from the BL/BLBs in the unselected columns, which degrades the read current. TBL_5σ values in the differential 10T, 12T, and P-P-N 10T SRAM bitcells were 1.7, 1.74, and 1.71 ns, respectively. By considering the WL decoding delay and sensing/data-out delay, the read delays of the proposed differential 7T and the differential 10T, 12T, and P-P-N 10T SRAMs were 3.15, 3.3, 3.34, and 3.31 ns, respectively (Fig. 11).

Meanwhile, in the single-ended operation, the sensing failure is counted when the VBL does not reach the 5σ trip voltage of the high-skewed inverter SA. As shown in Fig. 11, the 7T-1 and 8T SRAM bitcells have a longer read delay than the differential SRAM bitcells owing to a larger TBL due to a larger bitline swing. In a 7T-2 with a single-transistor read buffer, the RBL voltage cannot be fully discharged during a read “0” operation because the current of the unselected bitcells in the selected column interrupts the RBL discharge (Fig. 12), which causes the sensing failure.

During the read operation, the storage nodes of the bitcells can be disturbed by the BL charge. Even with the BL disturbance, the previously stored data needs to be maintained until the WL pulse is terminated. Thus, in this study,
a stable read operation implies that the previously stored data in the selected bitcells during the read operation is stably maintained until the WL pulse is terminated. When the read stability yield is calculated through importance sampling, a read operation failure is counted when the data in the selected bitcells at the end of the WL pulse differ from the previously stored data.

TABLE 2. Read stability yield in the RHSCs during a write operation.

FIGURE 13. Waveform of the floated Qb node at various corners.

Table 2 reports the read stability yields in the selected bitcells during the read operation and the RHSCs during the write operation of the proposed differential 7T and previously proposed SRAM bitcells. All of the selected bitcells achieved the 5σ read stability yield during read operation thanks to the decoupled read current path from the data node. The read “0” and “1” stability yields in the proposed differential 7T SRAM bitcell differ because it has an asymmetrical structure. Although the read “1” stability yield is slightly smaller than the read “0” stability yield (because the floating node Qb can be charged owing to the leakage current from VDD and the M node during the read “1” operation), a 5σ read stability yield can be achieved. In addition, although the Qb node during read “1” operation is floated by enabling the WLRB signal, the stored data cannot be flipped since the data Qb node voltage is reduced due to the coupling caused by the WLRB transition. Fig. 13 shows the Q and Qb nodes during a read “1” operation at various corners. At the worst corner (hot temperature and degraded VDD), the floating Qb node voltage is increased. However, since TBL is short enough, the Q node voltage does not flip.

C. STABILITY OF RHSCS DURING WRITE OPERATION
In the previously proposed SRAM bitcells with the bit-interleaved structure, the condition of the RHSCs during the read and write operations is identical to that of the selected bitcell during a read operation and thus, their read stability yield is identical too.

In the proposed differential 7T SRAM bitcell, the disturbance in the RHSCs during a write operation can be reduced by increasing TW_P1. Fig. 14 shows the read stability yield in the RHSCs according to NTWP1, which was set to 6 to achieve a 5σ read stability yield; the resulting TW_P1 is referred to as TW_P1_5σ. The read stability margin of the RHSCs at various corners for NTWP1 of 6 is shown in Fig. 15.

FIGURE 14. Read stability yield in RHSCs according to NTWP1.

FIGURE 15. Read stability margin at various corners.

In the RHSCs of the differential 10T, 12T, and P-P-N 10T SRAM bitcells during write operations, the data node is decoupled from the BL, which can stably maintain the data node. However, the RHSCs in the previously proposed single-ended 7T-1, 7T-2, and 8T SRAM bitcells undergo a BL disturbance. Thus, these bitcells cannot achieve the 5σ yield, as illustrated by the results in Table 2.

D. WRITE DELAY AND WRITE ABILITY
A write delay consisting of a WL decoding delay, TW_P1, and TW_P2 is longer during a write “1” operation than during a write “0” operation. This is because the Qb node is discharged by two nMOSs (PMR and PGR) during a write “1” operation. TW_P1 and TW_P2 are determined from NTWP1, the read stability yield in the RHSCs (see Section IV-C) and NTWP2, the write ability yield, respectively.

When the data stored in the selected bitcell differs from the written data, the voltage level of the internal node storing “1” (“0”) should be lower (higher) than the metastable point (the trip point of cross-coupled inverters) when the WL pulse is terminated for a stable write operation. The write ability yield is calculated by finding the number of write failures that do not satisfy the aforementioned requirement.

Fig. 16 shows the write ability yield according to NTWP2: the former is improved as the latter increases. However, the write ability yield is saturated because it is independent of the write time and determined by the strength ratio of the pull-up to the pass-gate transistors when the write time is sufficient with a large NTWP2. NTWP2 is set to 3. This minimizes the write delay without significantly degrading the write ability yield. In this case, TW_P2 was determined as optimal, i.e., TW_P2_optimal. The write ability margin for NTWP2 of 3 at various corners is shown in Fig. 17. The sum of TW_P2_optimal and TW_P1_5σ is 2.26 ns, and considering the WL decoding delay, the write delay of the proposed differential 7T SRAM is 2.73 ns.

FIGURE 16. Write ability yield according to NTWP2.

FIGURE 17. Write ability margin at various corners.

Fig. 18 shows the write ability yields and write assist voltages for the 5σ write ability yields of the bitcells; the single-ended 7T-2 SRAM bitcell is excluded because it suffers from a read sensing failure issue. The differential 12T SRAM bitcell achieved the highest write ability yield because the pull-up network is completely turned off. The single-ended 7T-1 and 8T SRAM bitcells have a higher write ability yield than the proposed differential 7T, differential 10T, and P-P-N 10T SRAM bitcells because there is only one nMOS pass gate transistor in the write path. However, the RHSCs during write operation in the single-ended 7T-1 and 8T SRAM bitcells cannot achieve 5σ read stability because they suffer from BL disturbance. The proposed differential 7T SRAM bitcell exhibited a higher write ability yield than the differential 10T and P-P-N 10T SRAM bitcells because the latter two bitcells incur a disturbance from the VVSS. The P-P-N 10T SRAM bitcell showed the lowest write ability yield since the nMOS-pMOS stack in the write current path cannot transfer full “0” and “1”.

FIGURE 18. Write ability yields and write assist voltage for the 5σ write ability yields of the bitcells.

A write assist circuit is necessary for the differential 10T, P-P-N 10T, and proposed 7T SRAM bitcells to ensure a 5σ write ability yield. Among the write assist circuits, negative VBL write assist exhibits the best efficiency because it increases the VDS and VGS of the pass-gate transistor [27]. However, it needs to toggle numerous column-based BLs and large capacitors, which significantly increases the area overhead and energy consumption. On the other hand, boosted VWL write assist enhances the pass-gate transistor by increasing VGS. In addition, it needs to toggle only one row-based WL, which reduces energy consumption for the write assist. Thus, it was used to achieve a 5σ write ability yield in this paper.

The P-P-N 10T SRAM bitcell cannot achieve a 5σ write ability yield even with boosted VWL write assist owing to the nMOS-pMOS stack in the write current path. The required VWL assist levels to achieve a 5σ write ability yield for the proposed differential 7T and differential 10T SRAM bitcells were 65 and 90 mV, respectively. Hence, the differential 10T SRAM bitcell requires a higher boosted VWL assist level than the proposed 7T SRAM bitcell because of a lower write ability yield.

E. SRAM OPERATING FREQUENCY
The SRAM operating frequency in the proposed differential 7T SRAM is determined by the longer of the read and write delays. According to Sections IV-B and D, the read delay (3.15 ns) is longer than the write delay (2.73 ns). Thus, the SRAM operating frequency in the proposed differential 7T SRAM was determined as 317 MHz according to the read delay.

In contrast, it is apparent that the operating frequencies of the differential 10T and 12T SRAMs are determined by the read delay because the timing overhead to achieve 5σ read stability
64112 VOLUME 9, 2021


J. S. Oh et al.: Differential Read/Write 7T SRAM With Bit-Interleaved Structure for Near-Threshold Operation

FIGURE 20. Comparison of energy-delay product and energy-delay-area


product.
FIGURE 19. Comparison of read, write and total operation energies.

in the RHSCs during a write operation is not required. Thus, their operating frequencies are 303 MHz and 299 MHz, respectively. Hence, the SRAM operating frequency of the proposed differential 7T SRAM is higher than that of the differential 10T and 12T SRAMs by 5% and 6%, respectively.

F. ENERGY CONSUMPTION AND STANDBY POWER
Fig. 19 shows a comparison of the energy consumption of the whole macro (256 rows and 128 columns), including peripheral circuits, for the 7T-1, 8T, P-P-N 10T, proposed differential 7T, differential 10T, and 12T SRAMs. The energy consumption was measured during an operational period by considering the dynamic and static energies. The energy consumption of the proposed differential 7T SRAM was compared to those of the 7T-1, 8T, P-P-N 10T, differential 10T, and 12T SRAMs; the 7T-2 SRAM is excluded because it cannot achieve a stable read operation.

During a read operation, the 7T-1 and 8T SRAMs with single-ended read operation have 41% and 53% lower read energy, respectively, than the proposed differential 7T SRAM, because the differential read operation incurs twice the BL switching activity. Among the SRAMs with differential read operation, the read energy consumption of the proposed differential 7T SRAM is lower than those of the P-P-N 10T, differential 10T, and 12T SRAMs by 72%, 73%, and 7%, respectively. The reason is as follows. The energy consumption is dominantly influenced by the BL/BLB capacitance, which is determined by the bitcell layout height. The P-P-N 10T, differential 10T, and 12T SRAM bitcells have a larger bitcell layout height than the proposed differential 7T SRAM bitcell due to the larger number of transistors. In addition, in the P-P-N 10T and differential 10T SRAM bitcells, a high VVSS capacitance, caused by sharing the VVSS among all of the bitcells in the four columns, is toggled.

During a write operation, the column-based signals in the differential 10T and 12T SRAMs need to be toggled in all of the selected columns, which incurs high energy consumption. In contrast, the row-based WL and WLRB signals in the proposed differential 7T SRAM are toggled in only the selected row, which requires less energy. Therefore, the energy consumed during toggling of the control signals in the write operation is the lowest in the proposed 7T SRAM. However, the differential 10T SRAM has the lowest write energy for the following reason. In the differential 12T and proposed 7T SRAMs, the BL or BLB in the unselected columns is discharged during a write operation. In contrast, in the differential 10T SRAM, neither the BL nor the BLB in the unselected columns is discharged because the VVSS remains at VDD. Thus, the proposed differential 7T SRAM consumes write energy that is 62% lower than that of the differential 12T SRAM but 36% higher than that of the differential 10T SRAM. However, the total operation energy is dominantly determined by the read energy because a read operation is mainly performed and a write operation occurs when a cache hit occurs. Considering the average read/write operation ratio of 7:1 in [28], the proposed differential 7T SRAM has a total operation energy consumption that is 70% and 23% lower than those of the differential 10T and 12T SRAM bitcells, respectively. However, the 7T-1, 8T, and P-P-N 10T SRAMs are excluded because the 7T-1 and 8T SRAM bitcells cannot achieve the 5σ read stability yield at the RHSCs in the write operation with the bit-interleaving structure and the P-P-N 10T SRAM bitcell cannot achieve the 5σ write ability yield even with boosted VWL write assist.

The standby power is measured at the minimum data retention voltage (VDR), which ensures the 5σ hold stability yield. Since the proposed differential 7T, differential 10T, and 12T SRAM bitcells have a cross-coupled inverter structure, the VDR of these cells is 0.275 V [26]. The simulation results reveal that the proposed differential 7T SRAM has a higher standby power, 154.9 pW, than the differential 10T and 12T SRAMs because the sub-threshold leakage from the BL to the storage node decreases with the stack effect due to the stacked nMOS in the differential 10T and 12T SRAMs. The differential 12T SRAM has a slightly lower standby power (138.9 pW) than the differential 10T SRAM (141 pW) because the additional pMOS transistor cannot transfer a full ‘‘0’’, which decreases the VDS of the pull-up transistor.

Fig. 20 shows the energy-delay product (EDP) and leakage-energy-delay product (LEDP) of the proposed differential 7T, differential 10T, and 12T SRAMs. Because the proposed differential 7T SRAM has the shortest delay and


lowest operation energy consumption, its EDP is 72% and 28% less than those of the differential 10T and 12T SRAMs, respectively. In addition, although the standby power of the proposed differential 7T SRAM with a single nMOS pass gate is slightly higher than that of the others, the LEDP of the proposed differential 7T SRAM is 69% and 19% less than those of the differential 10T and 12T SRAMs, respectively, thanks to its having the lowest EDP.

V. CONCLUSION
In the near-Vth region with a bit-interleaved structure, the RHSCs of the conventional 6T, 8T, and 7T SRAM bitcells undergo BL/BLB disturbance. The problem is resolved by using a column-based write WL in the differential 10T and 12T SRAM bitcells and pseudo-data nodes in the P-P-N 10T SRAM bitcell. However, these bitcells have several problems, such as a large bitcell area, a long delay, and high energy consumption. Hence, the differential 7T SRAM bitcell with an additional PMR nMOS transistor between the PUR and PDR is proposed to mitigate these problems. During a read operation, the PMR is turned off by setting the WLRB to VSS, which improves the read stability. Moreover, the differential write operation can be performed by using a pulsed WLRB signal during a write operation, thereby resolving the half-selected issue. The bitcell area of the differential 7T SRAM bitcell is 33%, 49%, and 37% smaller than those of the differential 10T, 12T, and P-P-N 10T SRAM bitcells, respectively, and its operating frequency is 5% and 6% higher than those of the differential 10T and 12T SRAMs, respectively. The proposed differential 7T SRAM has higher write energy consumption than the differential 10T SRAM owing to bitline discharging in the unselected columns but lower than the 12T SRAM in which VVSS is toggled. However, the differential 10T SRAM has the highest read operation energy due to the toggling of the VVSS. Considering the read/write operation ratio, the proposed differential 7T SRAM consumes 70% and 23% lower total energy than the differential 10T and 12T SRAMs, respectively. Moreover, the LEDP of the proposed differential 7T SRAM is 69% and 19% less than those of the differential 10T and 12T SRAMs, respectively. In conclusion, the proposed differential 7T SRAM bitcell achieves higher performance and operational yield with a smaller area along with low energy consumption.

REFERENCES
[1] D. Markovic, C. C. Wang, L. P. Alarcon, T.-T. Liu, and J. M. Rabaey, ‘‘Ultralow-power design in near-threshold region,’’ Proc. IEEE, vol. 98, no. 2, pp. 237–252, Feb. 2010.
[2] W.-K. Chen, Linear Networks and Systems. Belmont, CA, USA: Wadsworth, 1993, pp. 123–135.
[3] P. Hazucha, T. Karnik, J. Maiz, S. Walstra, B. Bloechel, J. Tschanz, G. Dermer, S. Hareland, P. Armstrong, and S. Borkar, ‘‘Neutron soft error rate measurements in a 90-nm CMOS process and scaling trends in SRAM from 0.25-µm to 90-nm generation,’’ in IEDM Tech. Dig., Washington, DC, USA, Dec. 2003, pp. 21.5.1–21.5.4.
[4] J. Maiz, S. Hareland, K. Zhang, and P. Armstrong, ‘‘Characterization of multi-bit soft error events in advanced SRAMs,’’ in IEDM Tech. Dig., Washington, DC, USA, Dec. 2003, pp. 21.4.1–21.4.4.
[5] T.-H. Kim, J. Liu, J. Keane, and C. H. Kim, ‘‘A 0.2 V, 480 kb subthreshold SRAM with 1 k cells per bitline for Ultra-Low-Voltage computing,’’ IEEE J. Solid-State Circuits, vol. 43, no. 2, pp. 518–529, Feb. 2008.
[6] K. Takeda, Y. Hagihara, Y. Aimoto, M. Nomura, Y. Nakazawa, T. Ishii, and H. Kobatake, ‘‘A read-static-noise-margin-free SRAM cell for low-VDD and high-speed applications,’’ IEEE J. Solid-State Circuits, vol. 41, no. 1, pp. 113–121, Jan. 2006.
[7] M.-F. Chang, M.-P. Chen, L.-F. Chen, S.-M. Yang, Y.-J. Kuo, J.-J. Wu, H.-Y. Su, Y.-H. Chu, W.-C. Wu, T.-Y. Yang, and H. Yamauchi, ‘‘A sub-0.3 V area-efficient L-shaped 7T SRAM with read bitline swing expansion schemes based on boosted read-bitline, asymmetric-VTH read-port, and offset cell VDD biasing techniques,’’ IEEE J. Solid-State Circuits, vol. 48, no. 10, pp. 2558–2569, Oct. 2013.
[8] N. Verma and A. Chandrakasan, ‘‘A 256 kb sub-threshold SRAM in 65 nm CMOS,’’ IEEE J. Solid-State Circuits, vol. 42, no. 3, pp. 680–688, Mar. 2007.
[9] Y. Morita, H. Fujiwara, H. Noguchi, Y. Iguchi, K. Nii, H. Kawaguchi, and M. Yoshimoto, ‘‘An area-conscious low-voltage-oriented 8T-SRAM design under DVS environment,’’ in Proc. IEEE Symp. VLSI Circuits, Kyoto, Japan, Jun. 2007, pp. 256–257.
[10] I. J. Chang, J.-J. Kim, S. P. Park, and K. Roy, ‘‘A 32 kb 10T sub-threshold SRAM array with bit-interleaving and differential read scheme in 90 nm CMOS,’’ IEEE J. Solid-State Circuits, vol. 44, no. 2, pp. 650–658, Feb. 2009.
[11] Y.-W. Chiu, Y.-H. Hu, M.-H. Tu, J.-K. Zhao, Y.-H. Chu, S.-J. Jou, and C.-T. Chuang, ‘‘40 nm bit-interleaving 12T subthreshold SRAM with data-aware write-assist,’’ IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 61, no. 9, pp. 2578–2585, Sep. 2014.
[12] C.-H. Lo and S.-Y. Huang, ‘‘P-P-N based 10T SRAM cell for low-leakage and resilient subthreshold operation,’’ IEEE J. Solid-State Circuits, vol. 46, no. 3, pp. 695–704, Mar. 2011.
[13] K. Ronse, P. De Bisschop, G. Vandenberghe, E. Hendrickx, R. Gronheid, A. V. Pret, A. Mallik, D. Verkest, and A. Steegen, ‘‘Opportunities and challenges in device scaling by the introduction of EUV lithography,’’ in IEDM Tech. Dig., San Francisco, CA, USA, Dec. 2012, pp. 18.5.1–18.5.4.
[14] S. Khandelwal et al. BSIM-CMG 107.0.0 Multi-Gate MOSFET Compact Model. Berkley Education. [Online]. Available: https://www-device.eecs.berkeley.edu/bsim/?page=BSIMCMG_LR
[15] C. Auth et al., ‘‘A 22 nm high performance and low-power CMOS technology featuring fully-depleted tri-gate transistors, self-aligned contacts and high density MIM capacitors,’’ in Proc. Symp. VLSI Technol., Honolulu, HI, USA, Jun. 2012, pp. 131–132.
[16] M. Shrivastava, B. Verma, M. S. Baghini, C. Russ, D. K. Sharma, H. Gossner, and V. R. Rao, ‘‘Benchmarking the device performance at sub 22 nm node technologies using an SoC framework,’’ in IEDM Tech. Dig., Baltimore, MD, USA, Dec. 2009, pp. 1–4.
[17] C. Millar, D. Reid, G. Roy, S. Roy, and A. Asenov, ‘‘Accurate statistical description of random dopant-induced threshold voltage variability,’’ IEEE Electron Device Lett., vol. 29, no. 8, pp. 946–948, Aug. 2008.
[18] C. H. Lin, R. Kambhampati, R. J. Miller, T. B. Hook, A. Bryant, W. Haensch, P. Oldiges, I. Lauer, T. Yamashita, V. Basker, T. Standaert, K. Rim, E. Leobandung, H. Bu, and M. Khare, ‘‘Channel doping impact on FinFETs for 22 nm and beyond,’’ in Proc. Symp. VLSI Technol., Honolulu, HI, USA, Jun. 2012, pp. 15–16.
[19] D. Ingerly et al., ‘‘Low-K interconnect stack with metal-insulator-metal capacitors for 22 nm high volume manufacturing,’’ in Proc. IEEE Int. Interconnect Technol. Conf., San Jose, CA, USA, Jun. 2012, pp. 249–251.
[20] E. Seevinck, F. J. List, and J. Lohstroh, ‘‘Static-noise margin analysis of MOS SRAM cells,’’ IEEE J. Solid-State Circuits, vol. 22, no. 5, pp. 748–754, Oct. 1987.
[21] Z. Guo, A. Carlson, L.-T. Pang, K. T. Duong, T.-J.-K. Liu, and B. Nikolic, ‘‘Large-scale SRAM variability characterization in 45 nm CMOS,’’ IEEE J. Solid-State Circuits, vol. 44, no. 11, pp. 3174–3192, Nov. 2009.
[22] R. V. Joshi, S. Mukhopadhyay, D. W. Plass, Y. H. Chan, C.-T. Chuang, and Y. Tan, ‘‘Design of sub-90 nm low-power and variation tolerant PD/SOI SRAM cell based on dynamic stability metrics,’’ IEEE J. Solid-State Circuits, vol. 44, no. 3, pp. 965–976, Mar. 2009.
[23] J. Wang, S. Nalam, and B. H. Calhoun, ‘‘Analyzing static and dynamic write margin for nanometer SRAMs,’’ in Proc. 13th Int. Symp. Low Power Electron. Design (ISLPED), Bangalore, India, Aug. 2008, pp. 129–134.
[24] T. S. Doorn, E. J. W. ter Maten, J. A. Croon, A. Di Bucchianico, and O. Wittich, ‘‘Importance sampling Monte Carlo simulations for accurate estimation of SRAM yield,’’ in Proc. 34th Eur. Solid-State Circuits Conf. (ESSCIRC), Edinburgh, U.K., Sep. 2008, pp. 230–233.


[25] T. Na, S.-H. Woo, J. Kim, H. Jeong, and S.-O. Jung, ‘‘Comparative study of various latch-type sense amplifiers,’’ IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 22, no. 2, pp. 425–429, Feb. 2014.
[26] K. Cho, J. Park, T. W. Oh, and S. Jung, ‘‘One-sided Schmitt-trigger-based 9T SRAM cell for near-threshold operation,’’ IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 67, no. 5, pp. 1551–1561, May 2020.
[27] M.-H. Tu, J.-Y. Lin, M.-C. Tsai, C.-Y. Lu, Y.-J. Lin, M.-H. Wang, H.-S. Huang, K.-D. Lee, W.-C. Shih, S.-J. Jou, and C.-T. Chuang, ‘‘A single-ended disturb-free 9T subthreshold SRAM with cross-point data-aware write word-line structure, negative bit-line, and adaptive read operation timing tracing,’’ IEEE J. Solid-State Circuits, vol. 47, no. 6, pp. 1469–1482, Jun. 2012.
[28] Y. Xie, Emerging Memory Technologies: Design, Architecture, and Applications. New York, NY, USA: Springer, 2013, p. 187.

JI SANG OH (Member, IEEE) was born in Seoul, South Korea, in 1996. He received the B.S. degree in electrical and electronic engineering from Yonsei University, Seoul, Republic of Korea, in 2021, where he is currently pursuing the Ph.D. degree in electrical and electronic engineering. His research interests are focused on FinFET-based low-power and high-performance SRAM cells.

JUHYUN PARK (Member, IEEE) was born in Incheon, South Korea, in 1988. He received the B.S. degree in electronic and electrical engineering from Hongik University, Seoul, South Korea, in 2012, and the Ph.D. degree in electrical and electronic engineering from Yonsei University, Seoul, in 2020. He joined SK Hynix Inc., Icheon, in 2020, where he is involved in mobile DRAM design.

KEONHEE CHO (Member, IEEE) was born in Seoul, South Korea, in 1994. He received the B.S. degree in electrical and electronic engineering from Yonsei University, Seoul, in 2018, where he is currently pursuing the Ph.D. degree in electrical and electronic engineering. His research interests are focused on near-threshold SRAM cell design and low-voltage SRAM peripheral circuit design.

TAE WOO OH (Member, IEEE) was born in Seoul, South Korea, in 1992. He received the B.S. degree in electrical and electronic engineering from Yonsei University, Seoul, in 2015, where he is currently pursuing the Ph.D. degree in electrical and electronic engineering. His research interests are focused on low-power and high-speed SRAM and next-generation semiconductor devices.

SEONG-OOK JUNG (Senior Member, IEEE) received the B.S. and M.S. degrees in electronic engineering from Yonsei University, Seoul, South Korea, in 1987 and 1989, respectively, and the Ph.D. degree in electrical engineering from the University of Illinois at Urbana–Champaign, Urbana, IL, USA, in 2002. From 1989 to 1998, he worked with Samsung Electronics Company Ltd., Hwasung, South Korea, where he was involved with specialty memories, such as video RAM, graphic RAM, and window RAM. He was with T-RAM Inc., Mountain View, CA, USA, where he was the Leader of the Thyristor-Based Memory Circuit Design Team. From 2003 to 2006, he worked with Qualcomm Inc., San Diego, CA, USA, where he was involved in high-performance low-power embedded memories, process variation tolerant circuit design, and low-power circuit techniques. Since 2006, he has been a Professor with Yonsei University. His current research interests include process variation-tolerant circuit design, low-power circuit design, mixed-mode circuit design, and next-generation memory and technology.
