A Single Error Correcting Code With One-Step Group
Abstract: Technology scaling has led to an increase in density and capacity of on-chip caches. This
has enabled higher throughput by enabling more low latency memory transfers. With the reduction
in size of SRAMs and development of emerging technologies, e.g., STT-MRAM, for on-chip cache
memories, reliability of such memories becomes a major concern. Traditional error correcting codes,
e.g., Hamming codes and orthogonal Latin square codes, either suffer from high decoding latency,
which leads to lower overall throughput, or high memory overhead. In this paper, a new single
error correcting code based on a shared majority voting logic is presented. The proposed codes trade
off decoding latency in order to improve the memory overhead posed by orthogonal Latin square
codes. A latency optimization technique is also proposed which lowers the decoding latency by
incurring a slight memory overhead. It is shown that the proposed codes achieve better redundancy
compared to orthogonal Latin square codes. The proposed codes are also shown to achieve lower
decoding latency compared to Hamming codes. Thus, the proposed codes achieve a balanced trade-
off between memory overhead and decoding latency, which makes them highly suitable for on-chip
cache memories which have stringent throughput and memory overhead constraints.
Keywords: error correcting codes; single error correction; orthogonal Latin square codes; Hamming
codes; shared majority vote; cache; memories; group partitioning
1. Introduction
As technology scales further, the size of and demand for high-capacity on-chip cache memory also increase. More products are adopting error correcting codes (ECC) in order to protect these
memories against soft errors. The vulnerability of SRAM caches to transient or soft errors grows with
increase in cache size [1]. Research has shown that on-chip caches, e.g., L1 cache, are also vulnerable
to soft errors with the increase in number of processor cores [2]. These soft errors can result from
cosmic radiation strikes [3] or can also be a result of process variations and defects such as unformed
vias. With technology scaling, the main goal has been to push towards higher frequency and lower
power. This translates to more reliability concerns and the need for faster addressing of such
concerns. Recent developments suggest leading manufacturers are using caches as large as 16 MB [4].
The emergence of newer technologies, e.g., spin transfer torque magnetic random-access memory
(STT-MRAM), show that such technologies can provide comparable latencies to enable high speed
data transfer [5] and lead to lower power consumption for on-chip cache memories.
The use of one-step decoding error correcting codes is prevalent in this domain since they offer
the advantages of low latency error correction. The essential limitation in this case is the combination of larger word sizes and higher frequencies. With a larger word size, more bits need to be protected, and in the case of a single
error, more bits will need to be processed by the decoding circuit. This would result in higher latency
due to increase in logic depth. However, with the push towards higher frequency, the decoding
latency needs to be reduced to enable high throughput. Thus, the two requirements are in conflict with each other, and a balance is required depending on the application.
Orthogonal Latin square (OLS) codes [6] are a class of majority logic decodable codes which
offer very low latency decoding based on a majority vote. They have been successfully used in caches
to enable reliable operation [7]. These codes have also been extended to address post-manufacturing
defects while ensuring a certain level of reliability, even under ultra-low voltages, which cause high bit error rates [8]. However, the major issue with such codes is their very high data redundancy, which leads to higher memory overhead. Thus, a significant portion of the cache memory is rendered unusable, since it must be used to store parity information in order to correct errors.
Hamming codes are another attractive alternative for single error correction [9]. The most
prevalent use of these codes is a single error correcting double error detecting (SECDED) code. They
have the advantage of low data redundancy which leads to smaller memory overhead. However,
these codes have higher decoding latencies compared to OLS codes due to their syndrome-matching-based decoding, which has higher logic depth. As technology scales further and more products push the frequency limit of operation, this latency sometimes becomes a limiting factor. This is especially true for cache memories, which require very low latency
decoding in order to enable good throughput.
Over the years, numerous research works have been proposed related to SRAMs and on-chip
caches. A new class of multiple-bit upset error correcting codes is proposed in [10]. Though these
codes can correct multiple adjacent bits, the latency is higher than traditional single error correcting
(SEC) Hamming codes, which can lead to reduced performance. An ultrafast single error correcting
code which achieves very low decoding latencies is proposed in [11]. However, the latency benefit
comes at the price of increased memory overhead, which is more than that of OLS codes. Unequal
error correcting schemes have also been proposed [12,13] wherein only certain special messages have
single error correction capabilities, while other messages only have single error detection capabilities.
Other architectural techniques have also been proposed to improve cache performance. A cache
architecture with variable-strength ECC is proposed in [14]. In this proposal, lines with zero or one failed cells use regular SECDED, while stronger multi-bit ECC protects a fraction of the cache after switching to low-voltage operation. A scheme to choose between regular ECC or error detection codes (EDC)
for blocks is proposed in [15]. To reduce performance penalty due to retrieval of backend copies for
corrupted blocks, a periodic scrubbing mechanism verifies the integrity of blocks protected by EDC
and replenishes corrupted data. These schemes are orthogonal to the current proposal and can be
used in tandem with the proposed codes to enhance the performance of their general SEC portion.
For on-chip cache memories, OLS codes in general have high memory overhead which prohibits
their adoption, while the low memory overhead Hamming code can possibly lead to a performance
bottleneck for low latency applications in the future. In this research work, a new single error
correction scheme is proposed which trades off the low decoding latency of OLS codes to optimize
the data redundancy. The proposed codes are targeted towards applications which need high
performance, while allowing some leeway in terms of memory overhead. The rest of the paper is
organized as follows. Section 2 gives background information on Hamming codes and OLS codes.
Section 3 describes the proposed work as well as an optimization technique to reduce the latency of
the proposed codes by slightly trading off memory overhead. Section 4 evaluates the proposed work
against Hamming codes and OLS codes. Section 5 presents the conclusion of this research work.
2. Background Information
The majority of single error correcting codes can be divided into two distinct types. One is direct syndrome-matching-based error correction, the most famous example of which is the Hamming code. The other is majority-voting-based error correction, e.g., the orthogonal Latin square code.
The next subsections describe these two basic codes in further detail.
Electronics 2020, 9, 709 3 of 16
Figure 1. Illustration of distance-3 code and the relationship between codewords and non-codewords.
Syndrome S = H × c (1)
The construction procedure simply involves listing all possible non-zero binary columns of length ⌈log2(n + 1)⌉. All columns with weight 1 form the parity portion of the matrix; the rest of the matrix represents the data portion. Each parity bit is computed as the XOR of all the data bits whose corresponding column value is 1 for that parity bit's row; the parity portion of the H-matrix is excluded from this computation. An example of the parity check matrix for a (7, 4) Hamming code with its parity equations is shown in Figure 2, where d represents data bits and p represents parity bits.
      d0 d1 d2 d3 p0 p1 p2
       1  1  1  0  1  0  0
H =    1  1  0  1  0  1  0
       1  0  1  1  0  0  1

Figure 2. Parity check matrix of a (7, 4) binary Hamming code.
The encoding and decoding circuits of a (7, 4) Hamming code are shown in Figure 3. The decoding
procedure involves computing the syndrome, as shown in Equation (1). This essentially translates to
a bunch of XOR operations between the stored data bits and the parity bits. If the computed syndrome
is 0, then the codeword is non-erroneous and there is no need for any correction. For any single error,
the syndrome bits will be identical to the column of the corresponding data bit that has flipped.
Consider the code in Figure 2 and assume that d2 has an error. In that case the new word y can be
represented as the actual codeword c added with an error vector e. The error vector e has an entry of
1 only at the 3rd position since bit d2 is in error, as shown in Equation (2). The syndrome of this word
is now the syndrome generated from the error vector, since the syndrome of a codeword is 0, as
shown in Equation (3). Thus, any single error is identified by the syndrome it produces. The decoding
procedure then involves matching the syndrome to the corresponding columns of a data bit, as shown
in Figure 3.
e = (0 0 1 0 0 0 0) (2)

S = H × y = H × (c + e) = H × e (3)
This is because if a data bit is in error, it affects all of its syndrome equations and thus the output of the corresponding AND gate is 1. If a data bit is not in error, an error elsewhere can corrupt at most one of the inputs of its AND gate, thus resulting in a 0. The output of the AND gate is then XORed with the data bit to produce the correct output.
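As an illustration of this syndrome-matching procedure, the following Python sketch models the (7, 4) code of Figure 2. The matrix comes from the figure; everything else (function names, data layout) is a hypothetical software model, not a hardware description:

```python
# Data portion of the H-matrix from Figure 2: parity p_i is the XOR of
# the data bits whose column holds a 1 in row i.
H_DATA = [
    [1, 1, 1, 0],  # p0 = d0 ^ d1 ^ d2
    [1, 1, 0, 1],  # p1 = d0 ^ d1 ^ d3
    [1, 0, 1, 1],  # p2 = d0 ^ d2 ^ d3
]

def encode(data):
    """Append the three parity bits to the four data bits."""
    return data + [sum(h * d for h, d in zip(row, data)) % 2 for row in H_DATA]

def decode(word):
    """Syndrome-matching decode: flip the data bit whose column equals S."""
    data, parity = word[:4], word[4:]
    syndrome = [(sum(h * d for h, d in zip(row, data)) + p) % 2
                for row, p in zip(H_DATA, parity)]
    if any(syndrome):
        for j in range(4):  # match the syndrome against data column j
            if [row[j] for row in H_DATA] == syndrome:
                data[j] ^= 1  # correct the flipped bit
                break
    return data

msg = [1, 0, 1, 1]
word = encode(msg)
word[2] ^= 1            # inject a single error in d2
assert decode(word) == msg
```

Note that a single error in a parity bit produces a weight-1 syndrome that matches no data column, so the data bits are returned untouched.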
      d0 d1 d2 d3 p0 p1 p2 p3
       1  1  0  0  1  0  0  0
H =    0  0  1  1  0  1  0  0
       1  0  1  0  0  0  1  0
       0  1  0  1  0  0  0  1

Figure 4. Parity check matrix of an (8, 4) orthogonal Latin square code.
The encoding procedure involves the computation of parity bits. This is the XOR operation of
all the data bits which are 1 in the row of the parity check matrix for which the parity bit is 1. The
decoding procedure involves the majority vote between the data bit itself and the 2t parity check
equations constructed from the rows of the parity check matrix. Thus, the decoder for data bit di will
have parity check equations from each row of the parity check matrix for which the column di is a 1.
The main advantage of the OLS codes is the simplicity of the decoder circuit, which makes it very
useful for memories with random accesses. The majority logic decoding circuit has very low latency
thereby increasing decoding speed and enabling faster read operations. The encoding circuit and
alternate decoder logic for each bit of the parity check matrix in Figure 4 are shown in Figure 5.
Figure 5. Encoding and decoding circuit of an (8, 4) orthogonal Latin square code.
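The one-step majority decoding just described can likewise be sketched for the (8, 4) OLS code of Figure 4. This is again a hypothetical software model for illustration; for t = 1, each bit is decoded by a majority vote over the bit itself and its two check estimates:

```python
# Data portion of the H-matrix from Figure 4: each data bit participates
# in exactly 2 parity checks.
H_DATA = [
    [1, 1, 0, 0],  # p0
    [0, 0, 1, 1],  # p1
    [1, 0, 1, 0],  # p2
    [0, 1, 0, 1],  # p3
]

def encode(data):
    return data + [sum(h * d for h, d in zip(row, data)) % 2 for row in H_DATA]

def decode(word):
    data, parity = word[:4], word[4:]
    out = []
    for i in range(4):
        votes = [data[i]]
        for r, row in enumerate(H_DATA):
            if row[i]:
                # estimate of d_i from check r: parity bit XOR other data bits
                est = parity[r]
                for j in range(4):
                    if row[j] and j != i:
                        est ^= data[j]
                votes.append(est)
        out.append(1 if sum(votes) >= 2 else 0)  # majority of 3 votes
    return out

cw = encode([0, 1, 1, 0])
cw[1] ^= 1                                      # single error in d1
assert decode(cw) == [0, 1, 1, 0]
```

Because each bit's vote depends only on its own two shallow check equations, the decoder is one level of XOR trees followed by a majority gate, which is what gives OLS codes their low latency.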
3. Proposed Codes
The key idea of the proposed codes is based on repeating the data portion of the parity check
matrix of a binary single error correcting orthogonal Latin square code. By design, the parity check
matrix of a single error correcting OLS code will have all unique columns in order to identify and
correct all single errors. However, by repeating a SEC OLS code, say, r times, each column now has r identical copies elsewhere in the matrix. Thus, if any of the corresponding data bits of the repeated columns is in error, the syndrome will be the same. A simple majority vote in this case would then mistakenly flip all the data bits corresponding to the repeated columns.
To alleviate this issue of mis-correction, we introduce the notion of groups. A group is a single
data portion of the parity check matrix which does not have any repeated columns within itself. Thus,
within a group, all columns are unique. This essentially ensures that any single error within a group
can easily be identified and corrected. Since there can be non-unique columns between different
groups, we introduce additional rows in the parity check matrix to differentiate between different
groups. The total number of additional parity bits needed for this purpose is given by pg = ⌈log2 g⌉, where g is the total number of groups. Thus, if the base parity check matrix is repeated r times, then g = r + 1.
An example of a parity check matrix with number of data bits k = 4 and a repetition of three is
shown in Figure 6. In this case, the different groups have been placed inside the differently colored
boxes. The lower two rows (inside the blue box) are used to identify which group the error belongs
to. The total number of data bits that can be protected using the above configuration is 16.
                          Repetitions (3)
         Group-0      Group-1      Group-2        Group-3        Parity Bits
      d0 d1 d2 d3  d4 d5 d6 d7  d8 d9 d10 d11  d12 d13 d14 d15  p0 p1 p2 p3 p4 p5
       1  1  0  0   1  1  0  0   1  1   0   0    1   1   0   0   1  0  0  0  0  0
       0  0  1  1   0  0  1  1   0  0   1   1    0   0   1   1   0  1  0  0  0  0
H =    1  0  1  0   1  0  1  0   1  0   1   0    1   0   1   0   0  0  1  0  0  0
       0  1  0  1   0  1  0  1   0  1   0   1    0   1   0   1   0  0  0  1  0  0
       0  0  0  0   0  0  0  0   1  1   1   1    1   1   1   1   0  0  0  0  1  0   } Group identification
       0  0  0  0   1  1  1  1   0  0   0   0    1   1   1   1   0  0  0  0  0  1   } rows

Figure 6. Parity check matrix of the proposed codes with four groups of four data bits each.
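The construction above can be sketched in Python as follows. This is a hypothetical software model of the Figure 6 matrix (the function name and layout are ours), with the base OLS data portion repeated across groups and ⌈log2 g⌉ group-identification rows appended:

```python
import math

# Data portion of the base SEC OLS code (k = 4) from Figure 6.
BASE = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]

def build_h(groups):
    """Data portion of the proposed H: repeated base rows + group-id rows."""
    k = len(BASE[0])
    pg = math.ceil(math.log2(groups))
    rows = [row * groups for row in BASE]          # base rows repeated g times
    for b in reversed(range(pg)):                  # group-identification rows:
        # column of data bit j*k+i carries bit b of its group index j
        rows.append([(j >> b) & 1 for j in range(groups) for _ in range(k)])
    return rows

H = build_h(4)
# 4 base rows + 2 group rows cover 16 data bits (6 parity bits in total).
assert len(H) == 6 and len(H[0]) == 16
# d0 and d4 agree on the base rows but differ on the group rows.
assert [H[r][0] for r in range(4)] == [H[r][4] for r in range(4)]
assert [H[r][0] for r in (4, 5)] != [H[r][4] for r in (4, 5)]
```

The final assertions confirm the key property: repeated columns are identical within the base rows, and only the group-identification rows distinguish them.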
As an example, consider that data bit d4 is in error. The corresponding syndrome bits for this error are given by Equation (4). Now, if any of the data bits d0, d8 or d12 were in error instead of d4, the syndrome bits S0:S3 would still be the same. Thus, a simple majority vote in this case is not sufficient
to identify the erroneous bit. Instead, an additional column match is required, apart from the majority
voting logic, to identify which group the error belongs to. Matching the syndrome bits S4:S5 to the
lower two rows of the parity check matrix indicates that the error must lie within Group-1. Since all
columns within a group are unique, once the group within which the error has occurred is identified,
a majority vote can capture which data bit is in error. Thus, taking a majority vote for each data bit in
Group-1 yields the correct data and flips the erroneous bit d4.
Syndrome S = (S0 S1 S2 S3 S4 S5)^T = (1 0 1 0 0 1)^T (4)
Thus, the proposed codes share the same majority voter for the data bits corresponding to the
repeated columns across all groups. The group identification is done by matching the syndrome bits
to the bits in the lower pg rows (i.e., group identifying bits), similar to the syndrome matching-based
decoding of Hamming codes. We discuss the detailed encoding and decoding procedure in the
subsequent subsections. The detection of double errors is not the focus of this paper, since a double error detection mechanism only requires one extra parity bit computed as the XOR of all the data bits. If the syndrome bit corresponding to this parity bit is zero while other syndrome bits are non-zero, then a double error has occurred, and an uncorrectable error flag can be raised. This mechanism applies to traditional Hamming codes and orthogonal Latin square codes, and the same mechanism can be used for the proposed codes as well.
The parity bit value is the XOR of all the data bits whose corresponding column value is 1 for the particular row; by design, any row has exactly one of the parity bits as 1. As an example, consider the parity bit p0 in Figure 6. In order to compute p0, we look at all the data columns that are 1 in the first row, which is given by Equation (5).

p0 = d0 ⊕ d1 ⊕ d4 ⊕ d5 ⊕ d8 ⊕ d9 ⊕ d12 ⊕ d13 (5)
In terms of hardware resources, the parity bits can be computed by taking the XOR of the
corresponding data bits. Thus, the parity bits can be constructed in a single step or single cycle. The
parity bits can then either be appended to the original data bits and stored in memory, or they can be
stored in a separate location relative to the address of the data bits being stored. The total number of parity bits p for k data bits per group and R repetitions of the base parity check matrix is shown in Equation (6).

p = 2√k + ⌈log2(R + 1)⌉ (6)

The total number of data bits protected by these p bits is k(R + 1).
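These counts can be sanity-checked with a small sketch. The check-bit formula below is our inference from the Figure 6 configuration (a base SEC OLS code over k data bits needs 2√k parity bits, plus the group-identification bits), so treat it as a reconstruction rather than the authors' exact expression:

```python
import math

def parity_bits(k, R):
    """Check bits for k data bits per group and R repetitions (inferred).

    Assumes k is a perfect square, as required by the base OLS code.
    """
    return 2 * math.isqrt(k) + math.ceil(math.log2(R + 1))

def protected_bits(k, R):
    """Total data bits covered by one proposed-code word."""
    return k * (R + 1)

# Figure 6 configuration: k = 4, R = 3 -> 16 data bits, 6 check bits.
assert protected_bits(4, 3) == 16
assert parity_bits(4, 3) == 6
# A plain OLS SEC code over the same 16 data bits needs 2*sqrt(16) = 8 bits.
assert 2 * math.isqrt(16) == 8
```

Even in this tiny configuration the proposed code saves two check bits over a plain OLS code; the gap widens as the word size grows.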
Figure 7. Decoding circuit of the proposed codes for four groups with four data bits each.
The above procedure leads to much lower redundancy due to the shared nature of parity bits.
In comparison to a traditional OLS code, which would have had 16 majority voting circuits, i.e., one
for each data bit, the proposed codes have 4 majority voters belonging to the base code, which are
shared across groups. The disadvantage of such a method is that it increases the logic depth of the
circuit, which leads to a slightly higher latency compared to traditional orthogonal Latin square
codes.
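Putting the pieces together, the group identification from S4:S5 and the shared vote can be modeled end to end. This is a hypothetical software sketch, not the authors' hardware implementation; under the single-error assumption, flipping a bit when both of its base checks are flagged reproduces the majority-vote outcome:

```python
# Software model of the proposed decoder for the Figure 6 matrix.
BASE = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]
GROUPS, K = 4, 4

def h_rows():
    rows = [row * GROUPS for row in BASE]            # repeated base rows
    for b in (1, 0):                                 # group-id rows -> S4, S5
        rows.append([(j >> b) & 1 for j in range(GROUPS) for _ in range(K)])
    return rows

def encode(data):
    return data + [sum(h * d for h, d in zip(r, data)) % 2 for r in h_rows()]

def decode(word):
    data, parity = word[:GROUPS * K], word[GROUPS * K:]
    syn = [(sum(h * d for h, d in zip(r, data)) + p) % 2
           for r, p in zip(h_rows(), parity)]
    if not any(syn):
        return data
    grp = 2 * syn[4] + syn[5]                        # group identification
    for i in range(K):                               # shared vote per base column
        # flip the bit only if every base check covering it is flagged
        if all(syn[r] for r in range(4) if BASE[r][i]):
            data[grp * K + i] ^= 1
    return data

msg = [i % 2 for i in range(16)]
cw = encode(msg)
cw[4] ^= 1                                           # the d4 example from the text
assert decode(cw) == msg
```

Only four voting checks exist, one per base column, and the group index merely steers which of the sixteen data bits receives the correction, which is exactly the sharing that reduces decoder area.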
The proposed codes also achieve lower decoding latency compared to Hamming codes. The proposed codes are highly configurable and, depending on the application requirements, can be configured for either low memory overhead (redundancy) or lower decoding latency to enable high throughput.
                          Repetitions (3)
         Group-0      Group-1      Group-2        Group-3        Parity Bits
      d0 d1 d2 d3  d4 d5 d6 d7  d8 d9 d10 d11  d12 d13 d14 d15  p0 p1 p2 p3 p4 p5 p6 p7
       1  1  0  0   1  1  0  0   1  1   0   0    1   1   0   0   1  0  0  0  0  0  0  0
       0  0  1  1   0  0  1  1   0  0   1   1    0   0   1   1   0  1  0  0  0  0  0  0
H =    1  0  1  0   1  0  1  0   1  0   1   0    1   0   1   0   0  0  1  0  0  0  0  0
       0  1  0  1   0  1  0  1   0  1   0   1    0   1   0   1   0  0  0  1  0  0  0  0
       1  1  1  1   0  0  0  0   0  0   0   0    0   0   0   0   0  0  0  0  1  0  0  0   } Group identification
       0  0  0  0   1  1  1  1   0  0   0   0    0   0   0   0   0  0  0  0  0  1  0  0   } rows
       0  0  0  0   0  0  0  0   1  1   1   1    0   0   0   0   0  0  0  0  0  0  1  0   } (one-hot per group)
       0  0  0  0   0  0  0  0   0  0   0   0    1   1   1   1   0  0  0  0  0  0  0  1   }
Figure 8. Parity check matrix for proposed latency optimization with four groups of four data bits
each.
Figure 9. Latency optimization of decoder circuit due to reduction in logic depth.
The maximum number of inputs to any single syndrome computation, for the unoptimized and latency optimized cases respectively, is given by Equations (7) and (8), where g is the number of groups and k is the number of data bits per group:

S_max,unopt = max(g·√k, (g/2)·k) (7)

S_max,opt = max(g·√k, k) (8)
The second part of the latency optimization involves data correction, as shown in Figure 9. For the unoptimized case, since multiple group identification rows can have 1s for a given group, all of these bits are involved in data correction. Thus, the maximum number of inputs involved in computing the final decision (i.e., whether the bit has flipped or not) includes two syndrome bits from the base OLS code and all the bits from the group identification column, as shown in Equation (9). Thus, it is a function of the total number of groups:

I_unopt = 2 + ⌈log2 g⌉ (9)

Comparatively, for the latency optimized case, the total number of inputs will always be three, since each group has its own syndrome bit. Thus, errors in any other group will not change this syndrome bit, and it can be used directly in decoding.
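These fan-in expressions can be checked with a small sketch. Equations (7)–(9) are reconstructed from the garbled extraction, and the function and symbol names below are ours, so treat this as an illustration under those assumptions (k is the number of data bits per group, assumed a perfect square):

```python
import math

def max_syndrome_inputs(g, k, optimized):
    """Widest syndrome XOR (Eqs. (7)/(8), as reconstructed).

    A base check spans g*sqrt(k) bits; an unoptimized group-id check spans
    (g/2)*k bits, while a one-hot (optimized) group row covers only k bits.
    """
    base = g * math.isqrt(k)
    return max(base, k if optimized else (g // 2) * k)

def correction_inputs(g, optimized):
    """Inputs to the per-bit correction decision (Eq. (9), as reconstructed).

    Unoptimized: 2 base syndrome bits + all ceil(log2 g) group bits;
    optimized: always 3 (2 base bits + the group's own syndrome bit).
    """
    return 3 if optimized else 2 + math.ceil(math.log2(g))

# Figure 6/8 configuration (g = 4, k = 4): both rows of H have 8 ones.
assert max_syndrome_inputs(4, 4, optimized=False) == 8
assert correction_inputs(4, optimized=False) == 4
assert correction_inputs(4, optimized=True) == 3
```

For large group counts the optimized correction fan-in stays constant at three, which is the source of the logic-depth reduction shown in Figure 10.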
Figure 10. Comparison of maximum decoder logic depth for unoptimized proposal and latency
optimized proposal.
4. Evaluation
The proposed codes are evaluated against both Hamming codes and OLS codes in this section
in terms of data redundancy (memory overhead), encoding latency, encoder area, encoder power
consumption, decoder area, decoder power consumption and decoding latency. All the codes have
been implemented using the Dataflow model in Verilog and have been exhaustively tested for correct
functionality. The codes were also synthesized using the Open Cell library in 15 nm FreePDK
technology [16] using Synopsys Design Compiler. Comparisons have been made for both the
encoding and the decoding circuit between Hamming codes, OLS codes, the basic proposed codes
and the latency optimized proposed codes for different data sizes of k = 32, 64, 128, 256, 512 and 1024
and different group sizes (or repetitions) of g = 2, 4, 8 and 16 based on the number of data bits for the
proposed codes.
Table 1 compares the encoding circuit of the different codes with different repetition values for
the proposed codes. For OLS codes, all the parity equations are uniform: each parity equation is an XOR of a fixed number of data bits equal to the size of the Latin square. Thus, the encoder
has very low latency. However, they suffer from higher memory overhead (redundancy) due to the
inherent structure of the code wherein fewer data bits are involved in each parity check equation.
Hamming codes on the other hand have very low redundancy, but this comes at the cost of higher
encoding latency, since more data bits are involved in each parity check equation, which increases
the logic depth.
The proposed codes as well as the latency optimized proposed codes strike an adequate balance
between the data redundancy and encoding latency when compared to OLS codes and Hamming
codes. This is because the number of data bits participating in each parity check is more than in OLS
codes, but either equal to Hamming codes (for original proposal) or less than Hamming codes (for
latency optimized version). In all, from the experiments we see that the original version of the
proposed code achieves up to 45% improvement in encoding latency compared to Hamming codes
while achieving up to 68.75% improvement in memory overhead compared to OLS codes. Similarly,
the latency optimized version of the proposed codes achieves up to 49% improvement in encoding
latency compared to Hamming codes while achieving up to 50% improvement in memory overhead
compared to OLS codes.
Table 2 compares the decoding circuit of the different codes. The memory overhead values remain the same as in the encoding comparison. Similar to the encoding circuitry, Hamming codes achieve the minimum memory overhead while incurring the highest decoder latency. At the other end of the spectrum, OLS codes achieve very low decoder
latency, but that comes at the cost of significantly higher memory overhead. The proposed codes
achieve a balance between both the decoder latency overhead and memory overhead compared to
Hamming codes and OLS codes. The original version of the proposed codes achieves up to 38%
improvement in decoder latency compared to Hamming codes. The latency optimized version of the
proposed codes achieves up to 43.75% improvement in decoding latency compared to Hamming
codes.
Figure 11 shows the comparison of memory overhead and decoder latency for the different
codes across different numbers of data bits. As can be seen, Hamming codes provide the least
memory overhead but that comes at the cost of high decoder latency. OLS codes instead have very
low decoder latency but come at the expense of high memory overhead. The proposed codes, both
the base version and the latency optimized version, provide a balanced trade-off between decoder
latency and memory overhead.
(Plot of decoder latency versus number of check bits for Hamming codes, OLS codes, the proposed codes, and the latency-optimized proposed codes.)
Figure 11. Comparison of memory overhead and decoder latency for the different codes across different numbers of data bits.
From both Table 1 and Table 2, it can be seen that OLS codes have an overall good performance
in terms of latency, area and power consumption, but OLS codes do suffer from high memory
overhead. The proposed codes focus on reducing the high memory overhead of OLS codes while still
maintaining an adequate decoder latency to enable good performance. Thus, a memory overhead-delay product (MODP) metric is used to make this comparison. Figure 12 shows the MODP
comparison normalized to OLS codes (i.e., OLS codes will have a MODP of 1) for both the
unoptimized proposal and latency optimized proposal. Since there are multiple possibilities of the
number of groups for the proposed codes, the best MODP value is plotted against OLS codes. It can
be seen that the proposed codes are able to achieve much better MODP compared to OLS codes, with
the MODP as low as 0.33 times OLS codes. This is possible due to the low memory overhead of the
proposed codes. The key takeaway from this figure is that the proposed codes are able to achieve
much lower memory overhead without a corresponding significant rise in decoding latency.
Figure 12. Comparison of memory overhead-delay product of decoder between OLS codes and
proposed codes.
5. Conclusions
In this paper, a new single error correcting code was presented based on a shared majority voting
decoding logic. The proposed codes trade off decoder latency for an improvement in memory
overhead by sharing the majority voting logic across groups with a repeated parity check matrix. This
allows for the use of a much lower degree Latin square, owing to the repetition, than would have
been used otherwise. Experiments and comparison to existing OLS codes show that the proposed
codes achieve significant improvement in terms of memory overhead while incurring a slight
overhead in decoding and encoding latency. A latency optimization technique is also presented
which improves the decoding latency while incurring a slight penalty on memory overhead.
However, the overall memory overhead is still lower than OLS codes. It is also shown that the
proposed codes achieve much better decoding latency compared to the prevalent Hamming codes.
Thus, the proposed codes can provide an excellent balance/trade-off between memory overhead and
decoding latency, specifically for on-chip memory applications, which need the low decoding latency
not found in a Hamming code but do not have enough resources to tolerate the high memory
overhead of an OLS code.
Author Contributions: Conceptualization, A.D. and N.A.T.; Methodology, A.D.; Investigation, A.D.; Validation,
A.D.; Resources, N.A.T.; Writing – Original Draft Preparation, A.D.; Writing – Review & Editing, A.D.;
Supervision, N.A.T. All authors have read and agreed to the published version of the manuscript.
References
1. Yan, J.; Zhang, W. Evaluating Instruction Cache Vulnerability to Transient Errors. In Proceedings of the
ACM Workshop on Memory Performance: Dealing with Applications, Systems and Architectures
(MEDEA), Seattle, WA, USA, 1–2 September 2006; pp. 21–28, doi:10.1145/1166133.1166136.
2. Tang, L.; Mars, J.; Vachharajani, N.; Hundt, R.; Soffa, M.L. The Impact of Memory Subsystem Resource
Sharing on Datacenter Applications. In Proceedings of the 38th International Symposium on Computer
Architecture (ISCA), San Jose, CA, USA, 4–8 June 2011; pp. 283–294, doi:10.1145/2000064.2000099.
3. Baumann, R.C. Radiation-Induced Soft Errors in Advanced Semiconductor Technologies. IEEE Trans.
Device Mater. Reliab. 2005, 5, 305–316, doi:10.1109/TDMR.2005.853449.
4. Frumusanu, A. The Apple iPhone 11, 11 Pro & 11 Pro Max Review: Performance, Battery, & Camera
Elevated. Available online: https://www.anandtech.com/show/14892/the-apple-iphone-11-pro-and-max-
review (accessed on 16 October 2019).
5. Imani, M.; Patil, S.; Rosing, T. Low Power Data-Aware STT-RAM Based Hybrid Cache Architecture. In
Proceedings of the IEEE International Symposium on Quality Electronic Design (ISQED), Santa Clara, CA,
USA, 15–16 March 2016; pp. 88–94, doi:10.1109/ISQED.2016.7479181.
6. Hsiao, M.Y.; Bossen, D.C.; Chien, R.T. Orthogonal Latin Square codes. IBM J. Res. Dev. 1970, 14, 390–394,
doi:10.1147/rd.144.0390.
7. Wilkerson, C.; Gao, H.; Alameldeen, A.R.; Chishti, Z.; Khellah, M.; Lu, S-L. Trading off Cache Capacity for
Reliability to Enable Low Voltage Operation. In Proceedings of the ACM International Symposium on
Computer Architecture (ISCA), Beijing, China, 21–25 June 2008; pp. 203–214, doi:10.1109/ISCA.2008.22.
8. Datta, R.; Touba, N.A. Post-Manufacturing ECC Customization Based on Orthogonal Latin Square Codes
and Its Application to Ultra-Low Power Caches. In Proceedings of the IEEE International Test Conference,
Austin, TX, USA, 2–4 November 2010; doi:10.1109/TEST.2010.5699221.
9. Hamming, R.W. Error Detecting and Error Correcting Codes. Bell Syst. Tech. J. 1950, 29, 147–160,
doi:10.1002/j.1538-7305.1950.tb00463.x.
10. Das, A.; Touba, N.A. Low Complexity Burst Error Correcting Codes to Correct MBUs in SRAMs. In
Proceedings of the ACM Great Lakes Symposium on VLSI (GLSVLSI), Chicago, IL, USA, 23–25 May 2018;
pp. 219–224, doi:10.1145/3194554.3194570.
11. Adalid, L.S.; Gil, P.; Gil-Tomás, J.; Gr, D.; Baraza-Calvo, J.C. Ultrafast Single Error Correction Codes for
Protecting Processor Registers. In Proceedings of the IEEE European Dependable Computing Conference
(EDCC), Paris, France, 7–11 September 2015; pp. 144–154, doi:10.1109/EDCC.2015.30.
12. Alam, I.; Schoeny, C.; Dolecek, L.; Gupta, P. Parity++: Lightweight Error Correction for Last Level Caches.
In Proceedings of the IEEE/IFIP International Conference on Dependable Systems and Networks
Workshops (DSN-W), Luxembourg City, Luxembourg, 25–28 June 2018; pp. 114–120, doi:10.1109/DSN-
W.2018.00048.
13. Schoeny, C.; Sala, F.; Gottscho, M.; Alam, I.; Gupta, P.; Dolecek, L. Context-Aware Resiliency: Unequal
Message Protection for Random-Access Memories. IEEE Trans. Inf. Theory 2019, 65, 6146–6159,
doi:10.1109/TIT.2019.2918209.
14. Alameldeen, A.R.; Wagner, I.; Chishti, Z.; Wu, W.; Wilkerson, C.; Lu, S-L. Energy-Efficient Cache Design
Using Variable-Strength Error-Correcting Codes. In Proceedings of the ACM International Symposium on
Computer Architecture (ISCA), San Jose, CA, USA, 4–8 June 2011; pp. 461–471,
doi:10.1145/2000064.2000118.
15. Huang, P.; Subedi, P.; He, X.; He, S.; Zhou, K. FlexECC: Partially relaxing ECC of MLC SSD for better cache
performance. In Proceedings of the USENIX Annual Technical Conference, Philadelphia, PA, USA, 19–20
June 2014; pp. 489–500.
16. Martins, M.; Matos, J.M.; Ribas, R.P.; Reis, A.; Schlinker, G.; Rech, L.; Michelsen, J. Open Cell Library in
15nm FreePDK Technology. In Proceedings of the ACM International Symposium on Physical Design
(ISPD), Monterey, CA, USA, 29 March–1 April 2015; pp. 171–178, doi:10.1145/2717764.2717783.
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).