ECC in DDR Memories - DesignWare IP - Synopsys
Introduction
Double Data Rate Synchronous Dynamic Random-Access Memory (DDR SDRAM or simply DRAM)
technology is widely used for main memory in almost all applications today, ranging from
high-performance computing (HPC) to power- and area-sensitive mobile applications. This is due to DDR’s many
advantages, including high density with a simple architecture, low latency, and low power consumption.
JEDEC, the organization that specifies memory standards, has defined and developed four DRAM
categories to guide designers to precisely meet their memory requirements: standard DDR (DDR5/4/3/2),
mobile DDR (LPDDR5/4/3/2), graphics DDR (GDDR3/4/5/6), and high-bandwidth DRAM (HBM2/2E/3). Figure
1 shows a high-level block diagram of a memory subsystem in a typical system-on-chip (SoC), which
comprises a DDR memory controller, DDR PHY, DDR channel, and DDR memory. As per JEDEC's definition,
the DDR channel is composed of Command/Address and data lanes. The simplified DDR memory shown
in Figure 1 can represent a DRAM memory component from any of the four categories.
As with any electronic system, errors in the memory subsystem are possible due to design failures/defects
or electrical noise in any one of the components. These errors are classified as either hard-errors (caused by
design failures) or soft-errors (caused by system noise or memory array bit flips due to alpha particles, etc.).
As the names suggest, hard-errors are permanent and soft-errors are transient in nature. Although it is
logical to expect the DRAMs (with large memory arrays that get denser with every standards generation
on a smaller process node) to be the main source of memory errors, end-to-end protection from the
controller to the DRAMs is highly desirable for overall memory subsystem robustness.
To handle these memory errors during runtime, the memory subsystem must have advanced RAS
(Reliability, Availability, and Serviceability) features to prolong the overall system uptime at times of memory
errors. Without RAS features, the system will most likely crash due to memory errors. However, RAS features
allow the system to continue operating when there are correctable errors, while logging the uncorrectable
error details for future debugging purposes.
ECC as a Memory RAS Feature
With an Error Correction Code (ECC) scheme, the controller generates the ECC codes based on the actual
WR (WRITE) data. The memory stores both the WR data and the ECC code.
During a RD (READ) operation, the controller reads both the data and respective ECC code from the
memory. The controller regenerates the ECC code from the received data and compares it against the
received ECC code.
If there is a match, then no errors have occurred. If there are mismatches, the ECC SECDED
(single-error correction, double-error detection) mechanism allows the controller to correct any
single-bit error and detect double-bit errors.
Such an ECC scheme provides an end-to-end protection against single-bit errors that can occur anywhere in
the memory subsystem between the controller and the memory.
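The generate-on-write, regenerate-and-compare-on-read sequence above can be sketched with a textbook extended Hamming code. This is a minimal illustration only: the article does not specify the code matrix a real controller uses, and the function names `num_hamming_bits`, `secded_encode`, and `secded_decode` are hypothetical.

```python
def num_hamming_bits(k):
    """Smallest r with 2**r >= k + r + 1 (enough check bits to locate one error)."""
    r = 0
    while (1 << r) < k + r + 1:
        r += 1
    return r


def secded_encode(data):
    """Encode a list of 0/1 data bits; index 0 of the result is the overall parity bit."""
    r = num_hamming_bits(len(data))
    n = len(data) + r
    code = [0] * (n + 1)
    bits = iter(data)
    for pos in range(1, n + 1):
        if pos & (pos - 1):          # not a power of two: data position
            code[pos] = next(bits)
    for i in range(r):               # check bit 2**i covers positions with bit i set
        p = 1 << i
        parity = 0
        for pos in range(1, n + 1):
            if pos & p:
                parity ^= code[pos]
        code[p] = parity
    code[0] = sum(code) % 2          # overall parity enables double-error detection
    return code


def secded_decode(code):
    """Return (status, data); status is 'ok', 'corrected', or 'uncorrectable'."""
    code = list(code)
    n = len(code) - 1
    syndrome = 0
    for pos in range(1, n + 1):
        if code[pos]:
            syndrome ^= pos          # XOR of set positions is 0 for a valid word
    overall = sum(code) % 2
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1 and syndrome <= n:
        code[syndrome] ^= 1          # syndrome 0: the overall parity bit itself flipped
        status = "corrected"
    else:
        return "uncorrectable", None
    data = [code[pos] for pos in range(1, n + 1) if pos & (pos - 1)]
    return status, data


# Usage: a 64-bit write word, one bit flipped somewhere before the read check.
word = [(0xDEADBEEFCAFEF00D >> i) & 1 for i in range(64)]
stored = secded_encode(word)
stored[17] ^= 1
status, read_back = secded_decode(stored)
```

In this sketch the mismatch between the regenerated and stored check bits is the syndrome: a non-zero syndrome with odd overall parity points at the single flipped bit, while a non-zero syndrome with even overall parity signals a double-bit error that can only be reported.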
Based on the actual storage of the ECC codes, the ECC scheme can be of two types: side-band ECC or inline
ECC. In side-band ECC, the ECC codes are stored on separate DRAMs and in inline ECC, the codes are stored
on the same DRAMs with the actual data.
As DDR5 and LPDDR5 support much higher data-rates than their predecessors, they support additional ECC
features for enhancing the robustness of the memory subsystem. On-die ECC in DDR5 and Link-ECC in
LPDDR5 are two such RAS schemes to further bolster the memory subsystem RAS capabilities.
Side-band ECC
The side-band ECC scheme is typically implemented in applications using standard DDR memories (such as
DDR4 and DDR5). As the name suggests, the ECC code is sent as side-band data along with the actual data
to memory. For instance, for a 64-bit data width, 8 additional bits are used for ECC storage. Hence, the DDR4
ECC DIMMs commonly used in today’s enterprise-class servers and data centers are 72 bits wide. These
DIMMs have two additional x4 DRAMs or a single additional x8 DRAM for the 8 bits of ECC storage. In
side-band ECC, the controller therefore writes and reads the ECC code along with the actual data, and no
additional WR or RD overhead commands are required. Figure 2 describes the WR and RD operation
flows with side-band ECC. When there are no errors in the received data, side-band ECC incurs minimal
latency penalty as compared to inline ECC.
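The 64 + 8 = 72-bit arithmetic above can be reproduced with a short calculation. The sketch below assumes an extended Hamming SECDED code (the article does not name the exact code), and the helper names and dictionary keys are hypothetical.

```python
def sec_check_bits(data_bits):
    """Check bits needed to correct one error: smallest r with 2**r >= k + r + 1."""
    r = 0
    while (1 << r) < data_bits + r + 1:
        r += 1
    return r


def secded_check_bits(data_bits):
    """One extra overall-parity bit upgrades single-error correction to SECDED."""
    return sec_check_bits(data_bits) + 1


def sideband_layout(data_width=64):
    """Bus width and extra-device count for side-band ECC (illustrative only)."""
    ecc_width = secded_check_bits(data_width)
    return {
        "data_width": data_width,
        "ecc_width": ecc_width,
        "bus_width": data_width + ecc_width,
        "overhead": ecc_width / data_width,
        "extra_x4_devices": -(-ecc_width // 4),   # ceiling division
        "extra_x8_devices": -(-ecc_width // 8),
    }


layout = sideband_layout(64)
```

For a 64-bit data width this yields 8 ECC bits, a 72-bit bus, a 12.5% storage overhead, and either two x4 devices or one x8 device for the ECC bits, matching the DIMM organization described above.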
Inline ECC
The inline ECC scheme is typically implemented in applications using LPDDR memories. As the LPDDR
DRAMs have a fixed channel width (16 bits for an LPDDR5/4/4X channel), side-band ECC becomes an
expensive solution with these memories. For instance, for a 16-bit data width, an additional 16-bit LPDDR
channel needs to be allocated for side-band ECC for the 7- or 8-bit ECC code-word. Moreover, the 7- or 8-bit
ECC code-word fills the additional 16-bit channel only partially, resulting in storage inefficiency and also
adding extra load to the command/address channel, possibly limiting performance. Hence, inline ECC
becomes a better solution for LPDDR memories.
Instead of requiring an additional channel for ECC storage, the controller in inline ECC stores the ECC code in
the same DRAM channel where the actual data is stored. Hence, the overall data-width of the memory
channel remains the same as the actual data-width.
In inline ECC, the 16-bit channel memory is partitioned such that a dedicated fraction of the memory is
allocated to ECC code storage. Because the ECC code is not sent along with the WR and RD data, the controller
generates separate overhead WR and RD commands for ECC codes. Hence, every WR and RD command for
the actual data is accompanied by an overhead WR and RD command, respectively, for the ECC data.
High-performance controllers reduce the penalty of such overhead ECC commands by packing the ECC data of
several consecutive addresses in one overhead ECC WR command. Similarly, the controller reads the ECC
data of several consecutive addresses from memory in one overhead ECC RD command and can apply the
read-out ECC data to the actual data from the consecutive addresses. Hence, the more sequential the traffic
pattern is, the smaller the latency penalty due to such ECC overhead commands. Figure 3 describes the WR and
RD operation flows with inline ECC.
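The effect of packing ECC data for consecutive addresses can be sketched with a simple address-mapping model. Every number here is an assumption made for illustration and is not taken from the article or from any specific controller: a 64-byte data burst, 8 ECC bytes per burst, an ECC region starting at the hypothetical address `ECC_BASE`, and a controller that keeps only the most recently fetched ECC burst.

```python
DATA_BURST_BYTES = 64                       # assumed size of one data burst
ECC_BYTES_PER_BURST = 8                     # assumed: 8 ECC bits per 64 data bits
BURSTS_PER_ECC_BURST = DATA_BURST_BYTES // ECC_BYTES_PER_BURST   # 8
ECC_BASE = 0x3800_0000                      # hypothetical start of the ECC region


def ecc_location(data_addr):
    """Map a data address to (address of its ECC burst, byte offset inside it)."""
    burst_index = data_addr // DATA_BURST_BYTES
    ecc_burst_addr = ECC_BASE + (burst_index // BURSTS_PER_ECC_BURST) * DATA_BURST_BYTES
    offset = (burst_index % BURSTS_PER_ECC_BURST) * ECC_BYTES_PER_BURST
    return ecc_burst_addr, offset


def count_ecc_reads(data_addrs):
    """Count overhead ECC RD commands when only the last ECC burst is retained."""
    reads = 0
    held = None
    for addr in data_addrs:
        ecc_burst_addr, _ = ecc_location(addr)
        if ecc_burst_addr != held:          # ECC for this address is not on hand
            reads += 1
            held = ecc_burst_addr
    return reads


sequential = [i * DATA_BURST_BYTES for i in range(16)]        # 16 back-to-back bursts
scattered = [i * DATA_BURST_BYTES * BURSTS_PER_ECC_BURST for i in range(16)]
seq_overhead = count_ecc_reads(sequential)
scat_overhead = count_ecc_reads(scattered)
```

Under these assumptions, 16 sequential data reads need only 2 overhead ECC reads, while 16 reads that each fall in a different 512-byte block need 16, which is the behavior the paragraph above describes for sequential versus non-sequential traffic.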
On-die ECC
With each DDR generation, it's common for the DRAM capacity to increase. It is also common for DRAM
vendors to shrink the process technology to achieve both higher speeds and better economies of scale in
production. With the higher capacity and speed coupled with the smaller process technology, the likelihood
of single-bit errors increases on the DRAM memory arrays. To counter this, DDR5
DRAMs have additional storage dedicated to ECC. On-die ECC is an advanced RAS feature that the
DDR5 system can enable for higher speeds. For every 128 bits of data, DDR5 DRAMs have 8 additional bits for
ECC storage.
The DRAMs internally compute the ECC for the WR data and store the ECC code in the additional storage. On
a read operation, the DRAMs read out both the actual data as well as the ECC code and can correct any
single-bit error on any of the read data bits. Hence, on-die ECC provides further protection against single-bit
errors inside the DDR5 memory arrays. As this scheme does not offer any protection against errors
occurring on the DDR channel, on-die ECC is used in conjunction with side-band ECC for enhanced end-to-
end RAS on memory subsystems. Figure 4 describes the WR and RD operation flows with on-die ECC.
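As a rough check of the 128 + 8 organization described above, the sketch below tests whether 8 check bits are enough for single-bit correction over 128 data bits, using the standard Hamming condition. The function names are hypothetical, and the article itself only states that on-die ECC corrects single-bit errors.

```python
DATA_BITS = 128
ECC_BITS = 8


def can_correct_single_error(data_bits, ecc_bits):
    """Hamming condition: the syndrome must distinguish every bit position plus 'no error'."""
    return (1 << ecc_bits) >= data_bits + ecc_bits + 1


def can_also_detect_double_errors(data_bits, ecc_bits):
    """Extended Hamming SECDED spends one of the check bits on overall parity."""
    return ecc_bits >= 1 and can_correct_single_error(data_bits, ecc_bits - 1)


on_die_sec = can_correct_single_error(DATA_BITS, ECC_BITS)
on_die_secded = can_also_detect_double_errors(DATA_BITS, ECC_BITS)
storage_overhead = ECC_BITS / DATA_BITS     # 0.0625, i.e. 6.25%
```

With these numbers, 8 bits suffice for single-bit correction over 128 bits at half the storage overhead of the 64 + 8 side-band organization, but they leave no room for the extra parity bit of an extended Hamming SECDED code. That is consistent with on-die ECC being described as a single-bit correction scheme used alongside side-band ECC.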
Link-ECC
The Link-ECC scheme is an LPDDR5 feature that offers protection against single-bit errors on the LPDDR5 link
or channel. The memory controller computes the ECC for the WR data and sends the ECC on specific bits
along with the data. The DRAM generates the ECC on the received data, checks it against the received ECC
data, and corrects any single-bit errors. The roles of the controller and the DRAM are reversed for the read
operation. Note that link-ECC does not offer any protection against single-bit errors on the memory array.
However, inline ECC coupled with link-ECC strengthens the robustness of LPDDR5 channels by providing an
end-to-end protection against single-bit errors. Figure 5 describes the WR and RD operation flows with link-
ECC.
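The coverage of the four schemes, as described in the sections above, can be summarized in a small lookup model. This is only a bookkeeping sketch: the two fault locations ("channel" and "array") and the names `COVERAGE`, `covered_locations`, and `is_end_to_end` are simplifications introduced here, not terms from a JEDEC standard.

```python
# Where each scheme can correct single-bit errors, per the descriptions above.
COVERAGE = {
    "side-band": {"channel", "array"},   # controller-generated, stored on separate DRAMs
    "inline":    {"channel", "array"},   # controller-generated, stored with the data
    "on-die":    {"array"},              # computed and checked inside the DDR5 DRAM
    "link":      {"channel"},            # checked at each end of the LPDDR5 link
}


def covered_locations(schemes):
    """Union of fault locations protected by the enabled schemes."""
    locations = set()
    for scheme in schemes:
        locations |= COVERAGE[scheme]
    return locations


def is_end_to_end(schemes):
    """True when both the channel and the memory array are protected."""
    return covered_locations(schemes) >= {"channel", "array"}


ddr5_combo = is_end_to_end(["side-band", "on-die"])
lpddr5_combo = is_end_to_end(["inline", "link"])
link_only = is_end_to_end(["link"])
```

The model reflects why on-die ECC or link-ECC alone is not sufficient: each protects only one location, so they are paired with a controller-generated scheme (side-band for DDR5, inline for LPDDR5) that spans the whole path.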
Conclusion
One of the widely used memory RAS features is the Error Correction Code (ECC) scheme. Applications using
standard DDR memories typically implement side-band ECC, while applications using LPDDR memories
implement inline ECC. With the higher speeds and hence pronounced SI effects on DDR5 and LPDDR5
channels, ECC is now supported even on DDR5 and LPDDR5 DRAMs in the form of on-die and link-ECC
respectively. Synopsys’ DesignWare® DDR5/4 and LPDDR5/4 IP solutions offer advanced RAS features
including all of the ECC schemes highlighted in this article.