Ripple-precharge TCAM: A low-power solution for network search engines

Conference Paper · November 2005


DOI: 10.1109/ICCD.2005.95 · Source: IEEE Xplore

Ripple-Precharge TCAM:
A Low-Power Solution for Network Search Engines
Deepak S Vijayasarathi, Mehrdad Nourani, Mohammad J. Akhbarizadeh, Poras T. Balsara
Center for Integrated Circuits & Systems
The University of Texas at Dallas
Richardson, TX 75083
{dxv033000, nourani, eazadeh, poras}@utdallas.edu

Proceedings of the 2005 International Conference on Computer Design (ICCD'05)
0-7695-2451-6/05 $20.00 © 2005 IEEE

ABSTRACT

A novel low-power ripple-precharge ternary CAM (RP-TCAM) architecture is proposed for longest prefix matching tasks. The main motivation behind this research is to reduce the dynamic power consumed in a TCAM by frequent charging and discharging of the highly capacitive match line. We address this issue by exploiting the fact that comparing only the first four bits of an incoming packet's destination address identifies up to 80% of the mismatches in the forwarding table. Based on this fact we devised a selective precharge scheme in which the match line is charged only when there is an exact match in the first four bits of the TCAM word, thereby significantly reducing the number of transitions on the match line. The parasitics for simulation were extracted from the layout of a 64 × 32 RP-TCAM architecture implemented in 0.18 µm technology. This structure has 1.71% less area and about 80% less power than a conventional TCAM of equal storage size and functionality. Our RP-TCAM architecture has a search time of 1.86 ns.

I. INTRODUCTION

Content addressable memory (CAM) is a fully associative memory with a search time of only one clock cycle, unlike traditional random access memory (RAM), which requires two or more clock cycles for a search operation. CAMs find a wide range of applications in cache memories, network search engines, telecommunication and cryptography. CAM is broadly classified into two types: binary CAM and ternary CAM (TCAM). Binary CAM is primarily used as an instruction or data cache, while ternary CAM, which has an additional "don't-care" state, is mainly used for longest prefix matching tasks in network search engines. One of the major issues in TCAM, or in general any CAM, is its very high dynamic power consumption. The main reason behind this issue is the fully parallel nature of the search operation. This fully parallel search causes all the match lines in a TCAM block to charge in their precharge phase and allows all but one match line to discharge during their evaluation phase. The one match line which does not discharge during the evaluation phase indicates the match in the search operation.

A. Prior Work

Most of the previous work on power reduction in CAM concentrated on reducing the dynamic power due to frequent transitions on the highly capacitive match line. Selective precharging of the match line has been one of the favorite techniques used by many CAM researchers to reduce the number of match-line transitions [1]. A few others have tried to minimize the signal swing on the match line, either by reducing the precharge and discharge levels [2] or by using diode pull-down switches [1]. Another, completely lateral, technique to reduce power in CAM is to minimize the voltage swing on the search lines when the input comparand is applied for a search operation. Sheikholeslami et al. [2] proposed the concept of local search lines with low-swing receivers, which enables the global search lines to have a minimum voltage swing. The low-swing receivers, which reside on the local search lines, then amplify this small voltage to obtain the desired voltage level. Hsiao et al. [3] set forth guidelines for designing low-power CAM based on their power models. They also minimized the power consumption of their CAM by using two variations of NAND-type CAM bits in their structure and by carefully crafting their layouts [3]. Efthymiou et al. [4] proposed a mixed serial-parallel CAM for use in caches which exploits the address patterns commonly found in application programs.

Another problem of TCAM, in addition to high power dissipation, is its low storage density, due to the high number of transistors per cell. Each TCAM cell requires 16 transistors (14 in some literature [5]), as opposed to 6 for SRAM or 2 for DRAM [6]. This has inspired some researchers to offer heuristics that optimize TCAM usage [7]. In state-of-the-art TCAM technology, any bit in a word can be masked independently. This flexibility comes at a cost: each cell includes two SRAM (DRAM) bits to be able to store each of the three possible states of the cell, namely 0, 1, and don't-care. In an earlier work, we offered an optimized TCAM cell that employs w + 1 RAM bits (instead of the conventional 2w bits) for a word of size w [8]. This structure, called prefix CAM (PCAM), employs about 22% fewer transistors than a conventional TCAM for equal storage size and equal functionality. However, the power saving of PCAM compared to TCAM was not significant.

B. Main Contribution

A low-power ripple-precharge TCAM (RP-TCAM) architecture for longest prefix matching tasks is proposed, which consumes about 80% less power than the conventional TCAM architecture. RP-TCAM has area and performance comparable to those of a conventional TCAM. The key novelty in this paper is twofold. Firstly, the precharge voltage to
evaluate the parallel search operation is selectively and serially rippled through the first four most significant bits, which are serial CAM bits. This idea exploits the fact that by comparing just the first four most significant bits of the TCAM word we can identify up to 80% of search mismatches. The second novelty of our proposed RP-TCAM architecture is that it does not have any timing signals. By eliminating the timing signals, we not only reduce the dynamic power dissipated in charging and discharging such highly capacitive nodes, but also save the energy consumed by their drivers.

C. Paper Organization

The rest of this paper is organized as follows. Section II describes the background of packet forwarding using TCAM and different low-power techniques in TCAM. RP-TCAM, a customized CAM-TCAM architecture, is described in Section III. The implementation details and the experimental results are presented in Section IV. Finally, concluding remarks are in Section V.

II. BACKGROUND

A. Packet Forwarding Using TCAM

In a network router, packet forwarding is carried out by determining the next hop destination address from the forwarding table maintained in the router. A longest prefix match is performed on the entries of the forwarding table for the destination address of the incoming packet. An entirely hardware-based solution for carrying out the longest prefix match is to use a TCAM. A typical implementation of an n × w TCAM-based architecture is shown in Figure 1. The forwarding table entries in the TCAM consist of destination addresses along with their corresponding mask values. The destination address of a packet arriving at the router is compared with all the destination addresses stored in the TCAM to determine whether there is a match. The priority encoder which follows the TCAM finds the best match, i.e. the longest matching prefix in network applications. The entries are sorted such that the destination address with the longest prefix has the highest priority and is stored in the lower addresses of the TCAM, and vice versa. There are many software-based solutions for fast lookup of the destination address, such as the Patricia tree or binary search algorithms [9], but the hardware-based solution of using a TCAM for IP forwarding is still more advantageous because of its very high search speed and simplicity. The main disadvantage of hardware-based lookup using TCAM is its very high power consumption, due to the simultaneous switching activity of almost all cells. As an example, the average power consumption of today's TCAM chips is in the range of 10 to 20 Watts [10]. Minimizing the power consumption is the main focus of this research.

Figure 1. Packet forwarding using TCAM. [Block diagram: an n × w TCAM array with word lines WL0 to WLn-1, bit lines BLw-1 to BL0, comparand lines CMPw-1 to CMP0 and match lines ML0 to MLn-1, surrounded by the data I/O interface, the address decoder, and the priority encoder, whose best match addresses the output RAM (e.g. next hop). The comparand is, e.g., the packet's destination address.]

B. TCAM Architecture

Figure 2 shows the basic structure of the conventional TCAM bit. The TCAM model we have assumed consists of three main components: an 8T CAM cell which stores the data bit and also compares its content (D) with the key (CMP), a 6T SRAM cell which stores the mask bit (MD), and a 2T XOR gate that compares the result of the CAM operation (X) with the complement of the mask bit (MDB) to determine whether there is a match. The contents of both SRAMs are read and written using the bit lines (BL and BLB). The WL and MWL lines are made active high for any read or write operation involving the data bit or the mask bit. The behavior of the TCAM cell is summarized in Tables I and II. The comparand, or key, for the search operation is fed to the CAM cell using the comparand lines (CMP and CMPB). The match line (ML), which is connected to one end of the 2-input XOR gate, is precharged to Vdd before any search operation, while the other end is connected to ground. The 2-input XOR gate is controlled by the output of the CAM cell (X) and the mask bit (MDB). Depending on the output of the CAM cell and the mask bit, the ML either discharges when there is a mismatch or retains its charge on a match. Typically, during a mismatch, the mask bit MD = 0 shows that the bit is a "care" bit and the CAM cell output X = 1 shows that there is a mismatch between data and key. See the second row in Table II.

TABLE I
COMPARISON IN TCAM CELL

D    CMP    X
0    0      0
0    1      1
1    0      1
1    1      0

TABLE II
BEHAVIOR OF THE EVALUATION LOGIC IN TCAM

MD    X    ML     Meaning
0     0    Vdd    "care" bit; match
0     1    0      "care" bit; mismatch
1     0    Vdd    "don't-care" bit; match
1     1    Vdd    "don't-care" bit; match
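As an illustration of Tables I and II and of the lookup flow in Figure 1, the following Python sketch models the cell logic and the priority encoder at a purely behavioural level. The function names, the 4-bit word width and the three-entry table are hypothetical choices made for this example and do not come from the paper; voltages are abstracted to logic values.

```python
from typing import List, Optional, Tuple

VDD, GND = 1, 0  # logical stand-ins for the match-line voltage levels

# One TCAM word: (data bits, mask bits), MSB first; mask bit 1 means "don't-care".
Word = Tuple[List[int], List[int]]


def cam_compare(d: int, cmp_bit: int) -> int:
    """Table I: X = D XOR CMP, so X = 1 flags a mismatch."""
    return d ^ cmp_bit


def cell_match_line(md: int, x: int) -> int:
    """Table II: ML is pulled low only by a "care" bit (MD = 0) that mismatches (X = 1)."""
    return GND if (md == 0 and x == 1) else VDD


def word_matches(data: List[int], mask: List[int], key: List[int]) -> bool:
    """A word matches when no cell discharges the shared, precharged match line."""
    return all(
        cell_match_line(md, cam_compare(d, k)) == VDD
        for d, md, k in zip(data, mask, key)
    )


def priority_encode(match_lines: List[bool]) -> Optional[int]:
    """Return the lowest matching address (entries are stored longest prefix first)."""
    for address, hit in enumerate(match_lines):
        if hit:
            return address
    return None


def lookup(table: List[Word], key: List[int]) -> Optional[int]:
    """Compare the key against every word in parallel, then pick the best match."""
    return priority_encode([word_matches(d, m, key) for d, m in table])


# Hypothetical forwarding table, sorted longest prefix first.
TABLE: List[Word] = [
    ([1, 0, 1, 1], [0, 0, 0, 0]),  # prefix 1011 (length 4)
    ([1, 0, 0, 0], [0, 0, 1, 1]),  # prefix 10** (length 2)
    ([0, 0, 0, 0], [1, 1, 1, 1]),  # default route ****
]

print(lookup(TABLE, [1, 0, 0, 1]))
```

With the table sorted longest prefix first, the key 1001 misses the 4-bit prefix at address 0 and hits the 2-bit prefix at address 1, which is the address the priority encoder reports.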
Figure 2. A conventional TCAM cell. [Schematic: a CAM bit (WL, BL/BLB, D/DB, CMP/CMPB, output X), a mask bit (MWL, MD/MDB) and the evaluation logic that drives ML.]

C. Low-Power Techniques in TCAM

There are a variety of techniques by which we can reduce the power consumed in a TCAM. The TCAM bit inherently contains two SRAM cells, which are major contributors to static power. Many circuit techniques are already in place to reduce the static power in SRAMs [11]. The other major source of power dissipation in TCAM, which is the primary concern of this work, is dynamic power dissipation. The sources of dynamic power dissipation in TCAM can be classified under one of these three categories:

1) Power consumed due to frequent charging and discharging of the highly capacitive match line.
2) Power dissipated due to charging and discharging of the search lines when the comparand or the key is applied.
3) Power dissipated due to any frequently transitioning clock node.

These three categories agree with the criteria put forth by researchers for any low-power CAM design [3], which were derived from power models. Selective precharging has been one of the most predominant techniques used to reduce the frequent charging and discharging of the match-line node capacitance. A different approach is to reduce the voltage swing across the match line. The power dissipated in charging the search line is another major contributor to dynamic power in TCAM. Traditionally, the search lines of the CAM are combined with their corresponding bit lines. This increases the capacitance of the search line, because each SRAM cell on the bit line adds one extra drain capacitance. This problem can be resolved by having separate search and bit lines. Even after separating the search line from the bit line, the search line through which we send the key for comparison is by itself highly capacitive. Charging and discharging the search line alone during every search operation dissipates a lot of power. This problem is addressed by Sheikholeslami et al., who propose a hierarchical search line scheme to reduce the voltage swing of the highly capacitive search line [2]. The hierarchical search line scheme consists of a low-swing receiver followed by a local search line for every TCAM bit, along with the global search line. Only a very small voltage is applied to the global search line, which is then amplified by the low-swing receiver and fed into the local search line.

Several architectures proposed earlier for selective precharging used one or more transistors, usually for discharging the accumulated charge in the nodes connecting any two serial transistors or the node connecting the serial transistor with the match line [1]. However, these transistors are controlled by a frequently varying signal, such as a clock, that switches them "ON" every cycle to discharge the accumulated charge. Unfortunately, the capacitance of this clock node increases further when the clock node spans the entire TCAM block, and frequent charging and discharging of this node increases the dynamic power consumption. Moreover, the drivers used to drive these clock nodes also consume power. Our proposed architecture addresses these issues by eliminating any frequently varying signals, thereby reducing the dynamic power consumption.

III. RIPPLE-PRECHARGE TCAM ARCHITECTURE

A. Concept of Ripple-Precharge

Figure 3. Ripple-precharge CAM cell. [Schematic: a CAM bit (WL, BL/BLB, D/DB, CMP/CMPB, output X) and the evaluation logic placed between ML_IN and ML_OUT.]

TABLE III
BEHAVIOR OF THE EVALUATION LOGIC IN RP-TCAM

D    CMP    X    ML_OUT
0    0      0    Vdd
0    1      1    0
1    0      1    0
1    1      0    Vdd

The structure of the ripple-precharge CAM cell is shown in Figure 3. The main difference between the conventional TCAM cell shown in Figure 2 and this cell is in the evaluation logic. The series NMOS transistors of the evaluation circuit in
TCAM are replaced with PMOS transistors in the new CAM cell. The source of the PMOS transistor is connected to ML_IN and its drain is connected to the drain of the discharging NMOS transistor, whose source is grounded. The node which connects the PMOS and NMOS transistors is connected to the ML_OUT line, which is in turn connected to the ML_IN line of the next bit. Therefore, the PMOS transistors in series ripple Vdd only when there is an exact match in the most significant four bits of the RP-TCAM word. The additional NMOS transistor connected to the ML_OUT node of the parallel PMOS transistors ripples a 0 whenever there is a mismatch. The match line in the parallel part of the RP-TCAM is thus selectively charged by the rippling Vdd only when there is a match in the first 4 most significant bits. The behavior of the evaluation logic of a ripple-precharge CAM cell is summarized in Table III.

B. Circuit Behavior

The novel RP-TCAM architecture consists of both ripple-precharge CAM bits and conventional TCAM bits, as shown in Figure 4. The first four most significant bits of the RP-TCAM word are intentionally CAM bits and not TCAM bits. This architectural change is validated by the fact that the most significant eight bits of the packet are never masked [12]. Further, the elimination of the mask bit from the first four most significant bits of the TCAM is justified by the results of packet profiling [13]. This is mainly due to the hierarchical structure of internet protocol (IP) address allocation in classless inter-domain routing (CIDR). Therefore, the first four most significant bits of the proposed 32-bit RP-TCAM word are CAM bits and the least significant twenty-eight bits are TCAM bits. The RP-TCAM architecture initially evaluates the first four most significant bits of the search key serially. If there is an exact match between the current data bit stored in the SRAM and the corresponding comparand bit sent through the CMP and CMPB lines, Vdd ripples through the current bit to evaluate the next bit. If there is a mismatch in any of the first four bits of the ripple-precharge CAM, a 0 is propagated to the next bit from the current mismatched bit. The match line in the second part is charged to Vdd only when there is a match in the first four CAM bits.

Figure 4. 32-bit word of the proposed RP-TCAM architecture. [Schematic: CAM-BIT 31 to CAM-BIT 28 are ripple-precharge CAM cells fed from Vdd; CAM-BIT 27 to CAM-BIT 0, with MASK-BIT 27 to MASK-BIT 0, are conventional TCAM cells on the match line ML; node LA is marked.]

Whenever there is a mismatch in any one of the three most significant bits of the RP-TCAM, a 0 is propagated to the next bit as shown in Table III. If this 0 starts to ripple through the subsequent bits, i.e. assuming the subsequent bits have an exact match, we might end up with a Vt (threshold voltage) drop at the match line (ML). To avoid this and to have a clear distinction between a match and a mismatch, an inverter followed by a discharge transistor is connected to ML as shown in Figure 4. The input of the inverter (LA) is connected to the ML_OUT of the 29th bit of the RP-TCAM word. This is done in order to foresee a Vt rippling through the series PMOS transistors and to avoid a Vt drop on the match line for a mismatch.

The number of CAM bits through which Vdd is to be rippled is quite small for all practical packet traces and routing tables that we have tried so far. In what follows, we show one such simulation. The simulation results are obtained using the forwarding table taken from the AS1221 edge router on June 10, 2004, which has 168,178 active IPv4 route prefixes [14], and the packet trace from the main router of a national laboratory, also used in [15]. The objective of this simulation is to find the percentage of prefix mismatches corresponding to the N most significant bits of the incoming packet. The simulation results shown in Figure 5 reflect the percentage of searches in which a mismatch is discovered in the first N most significant bits (100α%). N = 4 is a very good choice because with just the 4 most significant bits of the packet's destination address we can determine 80% of the prefix mismatches. We did not choose N = 3 because the percentage of prefix mismatches was only 60% for N = 3, compared to 80% for N = 4. On the other hand, choosing N > 4 is not advisable because it incurs a performance penalty, as the search time will increase.

C. Power Analysis

Dynamic power consumption is given by $P_{dyn} = \frac{1}{2} C_{load} f V_{dd}^2$, where $C_{load}$ is the load capacitance switched, $V_{dd}$ is the supply voltage and $f$ is the frequency of transitions on that particular node. In our application, for memory modules of $n$ words we have:

\[
\begin{cases}
P_{TCAM} = n \cdot P_{ML} = n \cdot \frac{1}{2} C_{ML} f_{TCAM} V_{dd}^2 \\
P_{RP\text{-}TCAM} = n \cdot P_{ML} = n \cdot \frac{1}{2} C_{ML} f_{RP\text{-}TCAM} V_{dd}^2
\end{cases}
\tag{1}
\]
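To relate the circuit behaviour of Section III-B to the transition frequencies in Eq. (1), the following Python sketch models one search at the logic level and counts how many match lines are charged in each architecture. It is a behavioural approximation under stated assumptions: the function names, the 8-bit word width and the example table are hypothetical, and analogue effects such as the Vt drop and the inverter at LA are ignored.

```python
from typing import List, Tuple

SERIAL_BITS = 4  # N in the text: leading bits stored as ripple-precharge CAM cells

# One word: (data bits, mask bits), MSB first; mask bit 1 means "don't-care".
Word = Tuple[List[int], List[int]]


def bits(s: str) -> List[int]:
    """Helper for this example only: "1011" -> [1, 0, 1, 1]."""
    return [int(c) for c in s]


def ripple_precharge(data: List[int], key: List[int], n: int = SERIAL_BITS) -> bool:
    """Ripple Vdd through the first n CAM bits; True means Vdd reaches the parallel part."""
    ml = True  # ML_IN of the most significant bit is tied to Vdd
    for d, k in zip(data[:n], key[:n]):
        x = d ^ k              # Table I: X = 1 on a mismatch
        ml = ml and (x == 0)   # a match passes ML_IN on, a mismatch ripples a 0
    return ml


def rp_word_search(data: List[int], mask: List[int], key: List[int],
                   n: int = SERIAL_BITS) -> Tuple[bool, bool]:
    """Return (precharged, matched) for one RP-TCAM word."""
    if not ripple_precharge(data, key, n):
        return False, False    # the parallel match line is never charged
    matched = all(m == 1 or d == k
                  for d, m, k in zip(data[n:], mask[n:], key[n:]))
    return True, matched


def precharge_counts(table: List[Word], key: List[int],
                     n: int = SERIAL_BITS) -> Tuple[int, int]:
    """Match lines charged per search: (conventional TCAM, RP-TCAM)."""
    conventional = len(table)  # every match line is precharged on every search
    rp = sum(rp_word_search(d, m, key, n)[0] for d, m in table)
    return conventional, rp


# Hypothetical 8-bit table and key, chosen only to exercise both paths.
TABLE: List[Word] = [
    (bits("10110110"), bits("00000000")),
    (bits("10110000"), bits("00001111")),
    (bits("11000000"), bits("00001111")),
    (bits("01011010"), bits("00000011")),
]
KEY = bits("10110111")

print(precharge_counts(TABLE, KEY))
```

For this key, only the two words whose first four bits equal 1011 charge their parallel match line, while the conventional model charges all four. The fraction of words filtered out in this way plays the role of α in the analysis.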
Figure 5. Percentage of mismatch (100α%) when N most significant bits are used. [Plot: x-axis "Number of bits (N)"; y-axis "Mismatch for N bits (%)".]

Figure 6. Signal waveform on the match line for RP-TCAM (top curve) and TCAM (bottom curve). [Plot: traces ML(RP) and ML(CONV) against time in nanoseconds.]

Suppose α is the fraction of searches that lead to a mismatch found by checking the first 4 bits. Based on what we have discussed so far, in the RP-TCAM architecture we expect this fraction to be large, which means that on average 100α% of searches experience $\frac{4}{32} f_{TCAM}$ transitions and the rest see all-bit transitions in a word. In other words:

\[
f_{RP\text{-}TCAM} = \left[ \frac{4}{32}\,\alpha + \frac{32}{32}\,(1 - \alpha) \right] f_{TCAM} = \left( 1 - \frac{7}{8}\,\alpha \right) f_{TCAM}
\]

The total saving that the RP-TCAM architecture will achieve is:

\[
\Delta P = \frac{P_{TCAM} - P_{RP\text{-}TCAM}}{P_{TCAM}} = \frac{f_{TCAM} - f_{RP\text{-}TCAM}}{f_{TCAM}} = \frac{7}{8}\,\alpha
\tag{2}
\]

As explained in the previous section, in most practical cases of network search engines, 0.8 ≤ α ≤ 0.9 (see Figure 5). This is equivalent to a power saving (∆P%) of 70.0% to 78.8%.
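The arithmetic behind Eq. (2) can be checked with a few lines of Python. This is a minimal sketch; the function and parameter names are introduced here for illustration only, and the defaults (4 serial bits out of a 32-bit word) are the values used in the text.

```python
def rp_transition_ratio(alpha: float, serial_bits: int = 4, word_bits: int = 32) -> float:
    """f_RP-TCAM / f_TCAM: early mismatches toggle only the serial bits,
    the remaining searches see transitions across the whole word."""
    return alpha * serial_bits / word_bits + (1.0 - alpha)


def power_saving(alpha: float, serial_bits: int = 4, word_bits: int = 32) -> float:
    """Eq. (2): fractional power saving, equal to (7/8) * alpha for 4 of 32 bits."""
    return 1.0 - rp_transition_ratio(alpha, serial_bits, word_bits)


for alpha in (0.8, 0.9):
    print(f"alpha = {alpha:.1f}: saving = {100 * power_saving(alpha):.2f}%")
```

For α = 0.8 and α = 0.9 this gives savings of 70% and 78.75% respectively, which matches the 70.0% to 78.8% range quoted above after rounding.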

IV. IMPLEMENTATION AND RESULTS

A. Layout Implementation

The layouts of the proposed 32-bit RP-TCAM architecture and of the conventional TCAM architecture were drawn with Cadence tools in a 0.18 µm digital CMOS process [16]. The supply voltage was fixed at Vdd = 1.8 V. The parasitics were extracted for both layouts and the circuit simulation was done using SPICE3 [17]. The resulting match-line waveforms for the 32-bit RP-TCAM and TCAM words are shown in Figure 6. In the RP-TCAM architecture the match line (top curve) charges to Vdd only when there is an exact match between the TCAM word entry and the input comparand, unlike the conventional TCAM architecture, in which the match line (bottom curve) precharges during every search operation and discharges during all but one search operation. The waveforms comparing the time-average power consumed by both architectures are shown in Figure 7. They were obtained by applying 50 keys (longest prefix searches) in a transistor-level (SPICE) simulator [17].

Figure 7. Average power consumed by TCAM and RP-TCAM architectures.

The layouts for the 64 × 32 bit RP-TCAM block and the conventional TCAM block were drawn and are shown in Figure 8. There is no area increase in RP-TCAM. Table IV compares the time-average power, area and search time of the conventional TCAM and the proposed RP-TCAM architectures. A negative number in the last column indicates an improvement in area or power. Even for 50 searches, the average power consumed by the RP-TCAM architecture is 75.79% less than that of the conventional architecture. RP-TCAM has an area saving of 1.71% with a 2.76% degradation in performance.

B. Comments on Design and Implementation Issues

The most significant four bits of the CAM are arranged in a manner that they occupy the area equivalent to two TCAM
cells. Therefore, we have an area saving of two TCAM bits compared to the conventional TCAM architecture. The additional overhead of an inverter and an NMOS transistor to discharge the match line is negligible compared to other very recent architectures [4], which have a much higher area overhead. The ripple-precharge CAM bit has PMOS transistors in its evaluation logic, whereas the conventional TCAM has NMOS transistors. This makes the evaluation logic of the CAM part a little bigger than that of its TCAM counterpart. Overall, we have an area saving of 1.71% compared to the conventional TCAM architecture.

In Figure 6 we see that there is a small glitch in the match line of the proposed RP-TCAM architecture. The glitch is due to the corner case in which the mismatch occurs in the least significant 28 bits and not in the most significant 4 bits of the TCAM. The effect of this mismatch is that charging and evaluation of the RP-TCAM match line take place at the same time. This effect is minimized by making the PMOS transistors of the serial TCAM weaker, i.e. by decreasing their sizes, while not compromising on performance. Another important design constraint is the leakage current during a mismatch in the first four most significant bits. During a mismatch, the output of the CAM cell (X) is Vdd - Vtn and not Vdd. This makes the PMOS transistor leak. This issue can be addressed either by using transmission gate logic in the evaluation transistors or by using high-Vt transistors.

Figure 8. Layouts of 64 × 32 RP-TCAM and TCAM architectures.

TABLE IV
COMPARISON OF 64 × 32 CONVENTIONAL AND PROPOSED TCAM ARCHITECTURE

Comparison Metric     TCAM     RP-TCAM    Change [%]
Area (mm^2)           0.292    0.287      -1.71
Search Time (ns)      1.81     1.86       +2.76
Average Power (mW)    15.90    3.85       -75.79

V. CONCLUSION

This paper presents a novel, low-power TCAM architecture for longest prefix matching tasks in network search engine applications. Our proposed RP-TCAM architecture implements the idea of rippling Vdd to selectively precharge the highly capacitive match line. The condition for precharging the match line is derived from the fact that more than 80% of mismatches can be identified by comparing just the first four bits of the prefixes. We designed our RP-TCAM to exploit this inherent property of the prefixes. Both the conventional TCAM and the proposed RP-TCAM were implemented in 0.18 µm technology. The average power consumed by the proposed RP-TCAM is 75.79% less than that of the conventional TCAM. The area of the RP-TCAM is 1.71% less than that of the conventional TCAM for equal storage size and functionality. Our RP-TCAM architecture has a search time of 1.86 ns.

ACKNOWLEDGEMENTS

This work has been supported in part by the Cisco Systems Academic Research and Technology Initiative Award.

REFERENCES

[1] C.A. Zukowski and S.Y. Wang, "Use of selective precharge for low-power content-addressable memories," in Proceedings of the IEEE International Symposium on Circuits and Systems, vol. 3, pages 1788-1791, 1997.
[2] K. Pagiamtzis and A. Sheikholeslami, "Pipelined match-lines and hierarchical search-lines for low-power content-addressable memories," in Proceedings of the IEEE Custom Integrated Circuits Conference, pages 383-386, 2003.
[3] I.Y.L. Hsiao, D.H. Wang, and C.W. Jen, "Power modeling and low-power design of content addressable memories," in Proceedings of the IEEE International Symposium on Circuits and Systems, vol. 4, pages 926-929, 2001.
[4] A. Efthymiou and J.D. Garside, "A CAM with Mixed Serial-Parallel Comparison for Use in Low Energy Caches," IEEE Transactions on Very Large Scale Integration Systems, vol. 12, no. 3, March 2004.
[5] I. Arsovski, T. Chandler, and A. Sheikholeslami, "A Ternary Content-Addressable Memory (TCAM) Based on 4T Static Storage and Including a Current-Race Sensing Scheme," IEEE Journal of Solid-State Circuits, vol. 38, no. 1, January 2003.
[6] J. Rabaey, Digital Integrated Circuits, Prentice Hall, 1996.
[7] H. Liu, "Routing Table Compaction in Ternary CAM," IEEE Micro, January/February 2002.
[8] M. Akhbarizadeh, M. Nourani, D. Vijayasarathi, and P. Balsara, "PCAM: A Ternary CAM Optimized for Packet Forwarding Tasks," IEEE International Conference on Computer Design, October 2004.
[9] T.-B. Pei and C. Zukowski, "Putting Routing Tables in Silicon," IEEE Network Magazine, pages 42-49, January 1992.
[10] Integrated Device Technology, Inc., www.idt.com, 2005.
[11] C.H. Kim and K. Roy, "Dynamic Vt SRAM: A Leakage Tolerant Cache Memory for Low Voltage Microprocessors," in Proceedings of ISLPED'02, August 2002.
[12] RFC1519, the Internet Engineering Task Force, www.ietf.org, 2005.
[13] Internet Performance Measurement and Analysis Project, Univ. of Michigan and Merit Network Inc., www.merit.edu, 2005.
[14] bgp.potaroo.net, "BGP Routing Table Analysis Reports," 2004.
[15] T. Chiueh and P. Pradham, "Cache Memory Design for Internet Processors," IEEE Micro, February 2000.
[16] Cadence Design Systems Inc., "Virtuoso Layout Editor Users Guide - Version 4.4.6," June 2000.
[17] Texas Instruments Inc., "TI Spice3 User's and Reference Manual - Version 1.6," 1994.