Synchronous dynamic random-access memory (SDRAM) is any dynamic random-access memory (DRAM) in which the operation of its external pin interface is coordinated by an externally supplied clock signal.
Contents
1 SDRAM history
2 SDRAM timing
3 SDR SDRAM
3.1.1 Command signals
3.1.3 Addressing (A10/An)
3.1.4 Commands
3.3 Command interactions
3.6 Auto refresh
4 Generations of SDRAM
4.2 DDR(1) SDRAM
4.3 DDR2 SDRAM
4.4 DDR3 SDRAM
4.5 DDR4 SDRAM
5 Failed successors
6 See also
7 References
8 External links
SDRAM history
Although the concept of synchronous DRAM was well understood by the 1970s and was used with
early Intel processors, it was only in 1993 that SDRAM began its path to universal acceptance in the
electronics industry. In 1993, Samsung introduced its KM48SL2000 synchronous DRAM, and by
2000, SDRAM had replaced virtually all other types of DRAM in modern computers, because of its
greater performance.
SDRAM latency is not inherently lower (faster) than asynchronous DRAM. Indeed, early SDRAM
was somewhat slower than contemporaneous burst EDO DRAM due to the additional logic. The
benefits of SDRAM's internal buffering come from its ability to interleave operations to multiple banks
of memory, thereby increasing effective bandwidth.
Today, virtually all SDRAM is manufactured in compliance with standards established by JEDEC, an
electronics industry association that adopts open standards to facilitate interoperability of electronic
components. JEDEC formally adopted its first SDRAM standard in 1993 and subsequently adopted
other SDRAM standards, including those for DDR, DDR2 and DDR3 SDRAM.
SDRAM is also available in registered varieties, for systems that require greater scalability such
as servers and workstations.
Today, the world's largest manufacturers of SDRAM include Samsung Electronics, Panasonic, Micron Technology, and Hynix.
SDRAM timing
There are several limits on DRAM performance. Most noted is the read cycle time, the time between
successive read operations to an open row. This time decreased from 10 ns for 100 MHz SDRAM to
5 ns for DDR-400, but has remained relatively unchanged through DDR2-800 and DDR3-1600
generations. However, by operating the interface circuitry at increasingly higher multiples of the
fundamental read rate, the achievable bandwidth has increased rapidly.
Another limit is the CAS latency, the time between supplying a column address and receiving the corresponding data. Again, this has remained relatively constant at 10–15 ns through the last few generations of DDR SDRAM.
In operation, CAS latency is a specific number of clock cycles programmed into the SDRAM's mode
register and expected by the DRAM controller. Any value may be programmed, but the SDRAM will
not operate correctly if it is too low. At higher clock rates, the useful CAS latency in clock cycles
naturally increases. 10–15 ns is 2–3 cycles (CL2–3) of the 200 MHz clock of DDR-400 SDRAM, CL4–6 for DDR2-800, and CL8–12 for DDR3-1600. Slower clocks naturally allow fewer CAS latency cycles.
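As a minimal sketch of that arithmetic (in Python; the function name is illustrative, not from any specification), the useful CAS latency in cycles is the smallest whole number of clock periods covering the analog latency:

```python
import math

def cas_cycles(latency_ns: float, clock_mhz: float) -> int:
    """Smallest whole number of clock cycles covering latency_ns."""
    cycle_ns = 1000.0 / clock_mhz        # clock period in nanoseconds
    return math.ceil(latency_ns / cycle_ns)

# 10-15 ns at DDR-400's 200 MHz clock -> CL2-CL3
print(cas_cycles(10, 200), cas_cycles(15, 200))  # 2 3
# DDR2-800 (400 MHz clock) -> CL4-CL6
print(cas_cycles(10, 400), cas_cycles(15, 400))  # 4 6
# DDR3-1600 (800 MHz clock) -> CL8-CL12
print(cas_cycles(10, 800), cas_cycles(15, 800))  # 8 12
```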
SDRAM modules have their own timing specifications, which may be slower than those of the chips
on the module. When 100 MHz SDRAM chips first appeared, some manufacturers sold "100 MHz"
modules that could not reliably operate at that clock rate. In response, Intel published
the PC100 standard, which outlines requirements and guidelines for producing a memory module
that can operate reliably at 100 MHz. This standard was widely influential, and the term "PC100"
quickly became a common identifier for 100 MHz SDRAM modules, and modules are now commonly
designated with "PC"-prefixed numbers (PC66, PC100 or PC133 - although the actual meaning of
the numbers has changed).
SDR SDRAM
The 64 MB of sound memory on the Sound Blaster X-Fi Fatal1ty Pro sound card is built from two Micron 48LC32M8A2 SDRAM chips. They run at 133 MHz (7.5 ns clock period) and have 8-bit wide data buses.[1]
Originally simply known as SDRAM, single data rate SDRAM can accept one command and
transfer one word of data per clock cycle. Typical clock frequencies are 100 and 133 MHz. Chips are
made with a variety of data bus sizes (most commonly 4, 8 or 16 bits), but chips are generally
assembled into 168-pin DIMMs that read or write 64 (non-ECC) or 72 (ECC) bits at a time.
Use of the data bus is intricate and thus requires a complex DRAM controller circuit. This is because
data written to the DRAM must be presented in the same cycle as the write command, but reads
produce output 2 or 3 cycles after the read command. The DRAM controller must ensure that the
data bus is never required for a read and a write at the same time.
Typical SDR SDRAM clock rates are 66, 100, and 133 MHz (periods of 15, 10, and 7.5 ns). Clock
rates up to 200 MHz were available.
Control signals
CKE Clock Enable. When this signal is low, the chip behaves as if the clock has stopped. No commands are interpreted and command latency times do not elapse. The state of other control lines is not relevant. The effect of this signal is actually delayed by one clock cycle: the current clock cycle proceeds as usual, but the following clock cycle is ignored, except for testing the CKE input again. Normal operations resume on the rising edge of the clock after the one where CKE is sampled high.
Put another way, all other chip operations are timed relative to the rising edge of a masked clock. The masked clock is the logical AND of the input clock and the state of the CKE signal during the previous rising edge of the input clock. (A sketch of this gating appears after the signal list below.)
CS Chip Select. When this signal is high, the chip ignores all other inputs (except for CKE) and acts as if a NOP command is received.
DQM Data Mask. When high, this signal suppresses data I/O. When accompanying write data, the data is not actually written to the DRAM. When asserted high two cycles before a read cycle, the read data is not output from the chip. There is one DQM line per 8 bits on a x16 memory chip or DIMM.
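To make the CKE gating concrete, here is a minimal sketch (Python; the function name and list representation are illustrative) of which rising edges of the input clock the chip actually acts on:

```python
def masked_clock(cke_samples):
    """Return the indices of input-clock rising edges the chip honours.

    cke_samples[n] is the CKE level (0 or 1) sampled at rising edge n.
    The effect of CKE is delayed one cycle: edge n is honoured only if
    CKE was high at edge n-1 (edge 0 is assumed preceded by CKE high).
    """
    active = []
    prev_cke = 1
    for edge, cke in enumerate(cke_samples):
        if prev_cke:
            active.append(edge)
        prev_cke = cke
    return active

# CKE sampled low at edge 1: edge 1 itself still proceeds, edge 2 is
# ignored, and normal operation resumes at edge 3.
print(masked_clock([1, 0, 1, 1]))  # [0, 1, 3]
```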
Command signals
RAS Row Address Strobe. Despite the name, this is not a strobe, but rather simply a command bit. Along with CAS and WE, this selects one of 8 commands.
CAS Column Address Strobe. Despite the name, this too is simply a command bit. Along with RAS and WE, it selects one of 8 commands.
WE Write Enable. Along with RAS and CAS, this selects one of 8 commands. This generally distinguishes read-like commands from write-like commands.
CS  RAS CAS WE  BAn   A10   An      Command
H   x   x   x   x     x     x       Command inhibit (no operation)
L   H   H   H   x     x     x       No operation
L   H   H   L   x     x     x       Burst terminate: stop a burst read or burst write in progress
L   H   L   H   bank  L     column  Read: read a burst of data from the currently active row
L   H   L   H   bank  H     column  Read with auto precharge: as above, then precharge (close row) when done
L   H   L   L   bank  L     column  Write: write a burst of data to the currently active row
L   H   L   L   bank  H     column  Write with auto precharge: as above, then precharge (close row) when done
L   L   H   H   bank  row   row     Active (activate): open a row for read and write commands
L   L   H   L   bank  L     x       Precharge: deactivate (close) the current row of the selected bank
L   L   H   L   x     H     x       Precharge all: deactivate (close) the current row of all banks
L   L   L   H   x     x     x       Auto refresh: refresh one row of each bank, using an internal counter; all banks must be precharged
L   L   L   L   0 0   mode  mode    Load mode register: A0 through A9 are loaded to configure the DRAM chip
All SDRAM generations (SDR and DDRx) use essentially the same commands, with the changes
being:
DDR3 and DDR4 use A12 during read and write commands to indicate "burst chop", a half-length data transfer
Command interactions
The no operation command is always permitted, while the load mode register command requires
that all banks be idle, and a delay afterward for the changes to take effect. The auto refresh command also requires that all banks be idle, and takes a refresh cycle time tRFC to return the chip to the idle state. (This time is usually equal to tRCD+tRP.) The only other command that is permitted on an
idle bank is the active command. This takes, as mentioned above, tRCD before the row is fully open
and can accept read and write commands.
When a bank is open, there are four commands permitted: read, write, burst terminate, and
precharge. Read and write commands begin bursts, which can be interrupted by following
commands.
Interrupting a read burst
A read, burst terminate, or precharge command may be issued at any time after a read command,
and will interrupt the read burst after the configured CAS latency. So if a read command is issued on
cycle 0, another read command is issued on cycle 2, and the CAS latency is 3, then the first read
command will begin bursting data out during cycles 3 and 4, then the results from the second read
command will appear beginning with cycle 5.
If the command issued on cycle 2 were burst terminate, or a precharge of the active bank, then no
output would be generated during cycle 5.
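The cycle arithmetic can be sketched as follows (Python; a deliberately simplified model that ignores bank restrictions and auto-precharge, with illustrative names):

```python
def read_data_cycles(commands, cas_latency, burst_length):
    """Cycles on which the SDRAM drives read data.

    commands: list of (cycle, op) pairs, op in {'read', 'terminate',
    'precharge'}.  Any following command cuts off the current burst
    after the same CAS latency that delayed its data.
    """
    out = []
    for i, (cycle, op) in enumerate(commands):
        if op != 'read':
            continue
        start = cycle + cas_latency
        end = start + burst_length                 # exclusive
        if i + 1 < len(commands):                  # interrupted?
            end = min(end, commands[i + 1][0] + cas_latency)
        out.extend(range(start, end))
    return out

# Read at cycle 0, read at cycle 2, CL=3, burst length 4:
# cycles 3-4 from the first read, then 5-8 from the second.
print(read_data_cycles([(0, 'read'), (2, 'read')], 3, 4))
# Burst terminate at cycle 2 instead: output stops after cycle 4.
print(read_data_cycles([(0, 'read'), (2, 'terminate')], 3, 4))
```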
Although the interrupting read may be to any active bank, a precharge command will only interrupt
the read burst if it is to the same bank or all banks; a precharge command to a different bank will not
interrupt a read burst.
Interrupting a read burst with a write command is possible, but more difficult. It can be done if the DQM signal is used to suppress output from the SDRAM so that the memory controller may drive data over the DQ lines to the SDRAM in time for the write operation. Because the effects of DQM on read data are delayed by two cycles, but the effects of DQM on write data are immediate, DQM must be raised (to mask the read data) beginning at least two cycles before the write command, but must be lowered for the cycle of the write command (assuming the write command is intended to have an effect).
Doing this in only two clock cycles requires careful coordination between the time the SDRAM takes
to turn off its output on a clock edge and the time the data must be supplied as input to the SDRAM
for the write on the following clock edge. If the clock frequency is too high to allow sufficient time,
three cycles may be required.
If the read command includes auto-precharge, the precharge begins the same cycle as the
interrupting command.
For the sequential burst mode, later words are accessed in increasing address order, wrapping back
to the start of the block when the end is reached. So, for example, for a burst length of four, and a
requested column address of five, the words would be accessed in the order 5-6-7-4. If the burst
length were eight, the access order would be 5-6-7-0-1-2-3-4. This is done by adding a counter to
the column address, and ignoring carries past the burst length. The interleaved burst mode
computes the address using an exclusive or operation between the counter and the address. Using
the same starting address of five, a four-word burst would return words in the order 5-4-7-6. An
eight-word burst would be 5-4-7-6-1-0-3-2.[2] Although more confusing to humans, this can be easier
to implement in hardware, and is preferred by Intel for its microprocessors.[citation needed]
If the requested column address is at the start of a block, both burst modes (sequential and interleaved) return data in the same order 0-1-2-3-4-5-6-7. The difference only matters when fetching a cache line from memory in critical-word-first order.
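A minimal sketch of the two orderings (Python; the function name is illustrative):

```python
def burst_order(start_col, burst_length, interleaved=False):
    """Column addresses accessed within one burst.

    Sequential mode adds a counter to the low address bits, wrapping
    within the aligned block; interleaved mode XORs the counter in.
    """
    mask = burst_length - 1                  # burst_length is a power of 2
    block = start_col & ~mask                # aligned start of the block
    offset = start_col & mask
    if interleaved:
        return [block | (offset ^ i) for i in range(burst_length)]
    return [block | ((offset + i) & mask) for i in range(burst_length)]

print(burst_order(5, 4))                     # [5, 6, 7, 4]
print(burst_order(5, 8))                     # [5, 6, 7, 0, 1, 2, 3, 4]
print(burst_order(5, 4, interleaved=True))   # [5, 4, 7, 6]
print(burst_order(5, 8, interleaved=True))   # [5, 4, 7, 6, 1, 0, 3, 2]
```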
Auto refresh
It is possible to refresh a RAM chip by opening and closing (activating and precharging) each row in
each bank. However, to simplify the memory controller, SDRAM chips support an "auto refresh"
command, which performs these operations to one row in each bank simultaneously. The SDRAM
also maintains an internal counter, which iterates over all possible rows. The memory controller must
simply issue a sufficient number of auto refresh commands (one per row, 4096 in the example we
have been using) every refresh interval (tREF = 64 ms is a common value). All banks must be idle
(closed, precharged) when this command is issued.
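As a worked example of that requirement, using the figures above (Python; plain arithmetic):

```python
# With tREF = 64 ms and 4096 rows to refresh, the controller must issue
# an auto refresh command on average every 64 ms / 4096 = 15.625 us.
t_ref_ms = 64
rows = 4096
print(t_ref_ms * 1000 / rows, "microseconds between auto refresh commands")
```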
Prefetch architecture
Internally, each DRAM read accesses many more bits than the width of the data bus. Prefetch architecture takes advantage of this by allowing a single address request to result in multiple data words.
In a prefetch buffer architecture, when a memory access occurs to a row, the buffer grabs a set of adjacent data words on the row and reads them out ("bursts" them) in rapid-fire sequence on the I/O pins, without the need for individual column address requests. This assumes the CPU wants adjacent data words in memory, which in practice is very often the case. For instance, in DDR1, two adjacent data words will be read from each chip in the same clock cycle and placed in the prefetch buffer. Each word will then be transmitted on consecutive rising and falling edges of the clock cycle.
Similarly, in DDR2, with its 4n prefetch buffer, four consecutive data words are read into the buffer, and an external clock running twice as fast as the internal clock transmits one word on each of its consecutive rising and falling edges.[3]
The prefetch buffer depth can also be thought of as the ratio between the core memory frequency and the I/O frequency. In an 8n prefetch architecture (such as DDR3), the I/Os will operate 8 times faster than the memory core (each memory access results in a burst of 8 data words on the I/Os). Thus a 200 MHz memory core is combined with I/Os that each operate eight times faster (1600 megabits per second). If the memory has 16 I/Os, the total read bandwidth would be 200 MHz × 8 data words/access × 16 I/Os = 25.6 gigabits per second (Gbit/s), or 3.2 gigabytes per second (GB/s).
Modules with multiple DRAM chips can provide correspondingly higher bandwidth.
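The same calculation in code form (Python; plain arithmetic from the figures above):

```python
core_mhz = 200      # memory core frequency (8n prefetch, e.g. DDR3)
prefetch = 8        # data words per memory access
ios = 16            # I/O (data) pins on the chip

per_pin_mbps = core_mhz * prefetch            # 1600 Mbit/s per pin
total_gbit_s = per_pin_mbps * ios / 1000      # 25.6 Gbit/s
total_gbyte_s = total_gbit_s / 8              # 3.2 GB/s
print(per_pin_mbps, total_gbit_s, total_gbyte_s)
```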
Each generation of SDRAM has a different prefetch buffer size: SDR SDRAM transfers one word per access (1n), DDR two words (2n), DDR2 four words (4n), and DDR3 and DDR4 eight words (8n).
Generations of SDRAM
SDRAM feature map
Type   Feature changes
SDRAM  Vcc = 3.3 V
       Signal: LVTTL
DDR1   Access is 2 words
       Double clocked
       Vcc = 2.5 V
       2.5–7.5 ns per cycle
       Signal: SSTL_2 (2.5 V)[4]
DDR2   Access is 4 words
       "Burst terminate" removed
       4 units used in parallel
       Signal: SSTL_18 (1.8 V)[4]
DDR3   Access is 8 words
       Signal: SSTL_15 (1.5 V)[4]
       Much longer CAS latencies
DDR4   Vcc = 1.2 V (expected)
       Same 8n prefetch as DDR3
DDR(1) SDRAM
Main article: DDR SDRAM
While the access latency of DRAM is fundamentally limited by the DRAM array, DRAM has very high
potential bandwidth because each internal read is actually a row of many thousands of bits. To make
more of this bandwidth available to users, a double data rate interface was developed. This uses the
same commands, accepted once per cycle, but reads or writes two words of data per clock cycle.
The DDR interface accomplishes this by reading and writing data on both the rising and falling edges
of the clock signal. In addition, some minor changes to the SDR interface timing were made in
hindsight, and the supply voltage was reduced from 3.3 to 2.5 V. As a result, DDR SDRAM is not
backwards compatible with SDR SDRAM.
DDR SDRAM (sometimes called DDR1 for greater clarity) doubles the minimum read or write unit;
every access refers to at least two consecutive words.
Typical DDR SDRAM clock rates are 133, 166 and 200 MHz (7.5, 6, and 5 ns/cycle), generally
described as DDR-266, DDR-333 and DDR-400 (3.75, 3, and 2.5 ns per beat). Corresponding 184-pin DIMMs are known as PC-2100, PC-2700 and PC-3200. Performance up to DDR-550 (PC-4400)
is available for a price.
DDR2 SDRAM
Main article: DDR2 SDRAM
DDR2 SDRAM is very similar to DDR SDRAM, but doubles the minimum read or write unit again, to
4 consecutive words. The bus protocol was also simplified to allow higher performance operation. (In
particular, the "burst terminate" command is deleted.) This allows the bus rate of the SDRAM to be
doubled without increasing the clock rate of internal RAM operations; instead, internal operations are
performed in units 4 times as wide as SDRAM. Also, an extra bank address pin (BA2) was added to
allow 8 banks on large RAM chips.
Typical DDR2 SDRAM clock rates are 200, 266, 333 or 400 MHz (periods of 5, 3.75, 3 and 2.5 ns),
generally described as DDR2-400, DDR2-533, DDR2-667 and DDR2-800 (periods of 2.5, 1.875, 1.5
and 1.25 ns). Corresponding 240-pin DIMMs are known as PC2-3200 through PC2-6400. DDR2 SDRAM is now available at a clock rate of 533 MHz, generally described as DDR2-1066, and the corresponding DIMMs are known as PC2-8500 (also named PC2-8600, depending on the manufacturer). Performance up to DDR2-1250 (PC2-10000) is available for a price.
Note that because internal operations are at 1/2 the clock rate, DDR2-400 memory (internal clock
rate 100 MHz) has somewhat higher latency than DDR-400 (internal clock rate 200 MHz).
DDR3 SDRAM
Main article: DDR3 SDRAM
DDR4 SDRAM
Main article: DDR4 SDRAM
DDR4 SDRAM is the successor to DDR3 SDRAM. It was revealed at the Intel Developer Forum in San Francisco in 2008, and was due to be released to market during 2011. The timing varied considerably during its development: it was originally expected to be released in 2012,[10] and later (during 2010) expected to be released in 2015,[11] before samples were announced in early 2011 and manufacturers began to announce that commercial production and release to market were anticipated in 2012. DDR4 is expected to reach mass-market adoption around 2015, comparable with the approximately five years DDR3 took to achieve mass-market transition over DDR2.
The new chips are expected to run at 1.2 V or less,[12][13] versus the 1.5 V of DDR3 chips, and to deliver in excess of 2 billion data transfers per second. They are expected to be introduced at frequency rates of 2133 MHz, estimated to rise to a potential 4266 MHz[14] and a lowered voltage of 1.05 V[15] by 2013. DDR4 will not double the internal prefetch width again, but will use the same 8n prefetch as DDR3.[16] Thus, it will be necessary to interleave reads from several banks to keep the data bus busy.
In February 2009, Samsung validated 40 nm DRAM chips, considered a "significant step" towards DDR4 development,[17] since, as of 2009, DRAM chips were only beginning to migrate to a 50 nm process.[18] In January 2011, Samsung announced the completion and release for testing of a 30 nm 2 GB DDR4 DRAM module. It has a maximum bandwidth of 2.13 Gbit/s at 1.2 V, uses pseudo open drain technology and draws 40% less power than an equivalent DDR3 module.[19][20]
Failed successors
In addition to DDR, there were several other proposed memory technologies to succeed SDR
SDRAM.
Synchronous-link DRAM (SLDRAM)
SLDRAM boasted higher performance and competed against RDRAM. It was developed during the late 1990s by the SLDRAM Consortium, which consisted of about 20 major DRAM and computer industry manufacturers. (The SLDRAM Consortium became incorporated as SLDRAM Inc. and then changed its name to Advanced Memory International, Inc.) SLDRAM was
an open standard and did not require licensing fees. The specifications called for a 64-bit bus
running at a 200, 300 or 400 MHz clock frequency. This is achieved by all signals being on the same
line and thereby avoiding the synchronization time of multiple lines. Like DDR SDRAM, SLDRAM
uses a double-pumped bus, giving it an effective speed of 400,[21] 600,[22] or 800 MT/s.
SLDRAM used an 11-bit command bus (10 command bits CA9:0 plus one start-of-command FLAG
line) to transmit 40-bit command packets on 4 consecutive edges of a differential command clock
(CCLK/CCLK#). Unlike SDRAM, there were no per-chip select signals; each chip was assigned an
ID when reset, and the command contained the ID of the chip that should process it. Data was
transferred in 4- or 8-word bursts across an 18-bit (per chip) data bus, using one of two differential
data clocks (DCLK0/DCLK0# and DCLK1/DCLK1#). Unlike standard SDRAM, the clock was
generated by the data source (the SLDRAM chip in the case of a read operation) and transmitted in
the same direction as the data, greatly reducing data skew. To avoid the need for a pause when the
source of the DCLK changes, each command specified which DCLK pair it would use. [23]
The basic read/write command consisted of (beginning with CA9 of the first word):
SLDRAM read, write, or row-op request packet: the 40-bit packet is sent as four 10-bit words on CA9:0, with FLAG high marking the first word. It carries, in order, 9 bits of device ID (ID8 through ID0), 6 bits of command code (CMD5 through CMD0), and then the bank, row, and column addresses.
Individual devices had 8-bit IDs. The 9th bit of the ID sent in commands was used to address
multiple devices. Any aligned power-of-2 sized group could be addressed. If the transmitted msbit
was set, all least-significant bits up to and including the least-significant 0 bit of the transmitted
address were ignored for "is this addressed to me?" purposes. (If the ID8 bit is actually considered
less significant than ID0, the unicast address matching becomes a special case of this pattern.)
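One plausible reading of that matching rule, as a sketch (Python; the function name and example values are illustrative, not from the specification):

```python
def id_matches(device_id: int, sent_id: int) -> bool:
    """SLDRAM 'is this addressed to me?' test.

    device_id: the chip's 8-bit ID.
    sent_id:   the 9-bit ID field from the command packet.

    If bit 8 (the msbit) of sent_id is clear, this is a unicast match.
    If it is set, all bits up to and including the least-significant 0
    bit of sent_id are ignored, addressing an aligned power-of-2 group.
    """
    if not sent_id & 0x100:                  # msbit clear: exact match
        return device_id == sent_id
    low = 0                                  # least-significant 0 bit
    while sent_id & (1 << low):
        low += 1
    mask = 0xFF & ~((1 << (low + 1)) - 1)    # ignore bits 0..low
    return (device_id & mask) == (sent_id & mask)

# 0x1FF (msbit set, all ones below) addresses every device.
print(id_matches(0x37, 0x1FF))                            # True
# 0x103 (binary 1_0000_0011) ignores bits 0-2: matches IDs 0x00-0x07.
print(id_matches(0x05, 0x103), id_matches(0x09, 0x103))   # True False
```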
A read/write command had the msbit clear: CMD5 = 0.
A notable omission from the specification was per-byte write enables; it was designed for systems
with caches and ECC memory, which always write in multiples of a cache line.
Additional commands (with CMD5 set) opened and closed rows without a data transfer, performed
refresh operations, read or wrote configuration registers, and performed other maintenance
operations. Most of these commands supported an additional 4-bit sub-ID (sent as 5 bits, using the
same multiple-destination encoding as the primary ID) which could be used to distinguish devices
that were assigned the same primary ID because they were connected in parallel and always
read/written at the same time.
There were a number of 8-bit control registers and 32-bit status registers to control various device
timing parameters.
Virtual Channel Memory (VCM) SDRAM
To read from VCSDRAM, after the Active command, a "Prefetch" command is required to copy data from the sense amplifier array to the channel SDRAM. This command specifies a bank, 2 bits of column address (to select the segment of the row), and 4 bits of channel number. Once this is performed, the DRAM array may be precharged while read commands to the channel buffer continue. To write, first the data is written to a channel buffer (typically initialized beforehand using a Prefetch command); then a Restore command, with the same parameters as the Prefetch command, copies a segment of data from the channel to the sense amplifier array.
Unlike a normal SDRAM write, which must be performed to an active (open) row, the VCSDRAM bank must be precharged (closed) when the Restore command is issued. An Active command immediately after the Restore command specifies the DRAM row and completes the write to the DRAM array. There is, in addition, a 17th "dummy channel" which allows writes to the currently open row. It may not be read from, but may be prefetched to, written to, and restored to the sense amplifier array.[24][25]
Although normally a segment is Restored to the same memory address as it was Prefetched from,
the channel buffers may also be used for very efficient copying or clearing of large, aligned memory
blocks. (The use of quarter-row segments is driven by the fact that DRAM cells are narrower than
SRAM cells. The SRAM bits are designed to be 4 DRAM bits wide, and are conveniently connected
to one of the 4 DRAM bits they straddle.) Additional commands prefetch a pair of segments to a pair
of channels, and an optional command combines prefetch, read, and precharge to reduce the
overhead of random reads.
Virtual Channel SDRAM commands[26]
The command encoding parallels that of standard SDRAM, using CS, RAS, CAS, and WE together with the bank address (BA) and address pins (A12-A0). The command set comprises: command inhibit (no operation); no operation; prefetch (with automatic segment precharge if AP = H); prefetch to the dummy channel; precharge bank; precharge all banks; read channel; write channel; write dummy channel (with auto-restore if AR = H); restore (with automatic segment precharge if AP = H); bank activate; an optional combined prefetch read with auto-precharge; auto refresh; and mode register set. Depending on the command, the address pins carry a bank, channel, segment, row, or column number.
The above are the JEDEC-standardized commands. Earlier chips did not support the dummy
channel or pair prefetch, and used a different encoding for precharge.
A 13-bit address bus, as described above, is suitable for a device up to 128 Mbit. It would have two
banks, each containing 8192 rows and 8192 columns. Thus, row addresses are 13 bits, segment
addresses are 2 bits, and 8 column address bits are required to select one byte from the 2048 bits
(256 bytes) in a segment.
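A quick check of those numbers (Python; plain arithmetic from the figures above):

```python
banks, rows, bits_per_row = 2, 8192, 8192
print(banks * rows * bits_per_row == 128 * 2**20)  # True: 128 Mbit total

segment_bits = bits_per_row // 4          # quarter-row segments
print(segment_bits, segment_bits // 8)    # 2048 bits = 256 bytes per segment
# Field widths: 13-bit row (2**13 = 8192 rows), 2-bit segment
# (4 segments per row), 8-bit column (2**8 = 256 bytes per segment).
```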