Using Single Error Correction Codes To Protect Against Isolated Defects and Soft Errors
Acknowledgements
Authors: Costas Argyrides, Pedro Reviriego
and Juan Antonio Maestro
Abstract
Different techniques have been used to deal
with defects and soft errors
Impact on reliability
Introduction
Reliability issues
Current Techniques
ECC can also be used to correct errors caused by
defects
An effective technique is presented that lets ECC
deal with isolated stuck-at defects and soft errors
on a memory chip
Related Work
Different techniques at various stages
1-D redundancy approach
Simple algorithm
But low repair efficiency
Continued
Submicron technology issue
Interleaving approach for multiple errors
Scrubbing
Permanent or temporary errors
Example 1
Two bits affected by errors: one soft error and one defect
This is uncorrectable
Example 2
Two soft errors and a defect
Example 3
Two soft errors and a defect
Continued
Case 3: 1 defect (stuck at the opposite of the
bit's value) + 1 soft error
The syndrome will mark a double error
The word will be copied into a register
Write all zeros to the word, read it back
Write all ones to the word, read it back
Locate the permanent error (from the previous 2
steps)
Change the value of that bit (the defect) and re-evaluate the syndrome
Then decode as is normally done in SEC-DED
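The Case 3 steps above can be sketched in Python. This is a minimal illustration, not the authors' implementation: it assumes a small extended Hamming (8,4) SEC-DED code, and the names `MemoryWord`, `flip`, and `read_word` are hypothetical helpers introduced only for this example. Writing the corrected word back at the end is also an assumption, since the slides do not say how the word is restored after the all-zeros and all-ones test writes.

```python
from typing import List, Optional, Tuple

N = 8  # extended Hamming (8,4): index 0 holds the overall parity bit


def encode(data: List[int]) -> List[int]:
    """Encode 4 data bits into an 8-bit SEC-DED word."""
    w = [0] * N
    w[3], w[5], w[6], w[7] = data
    w[1] = w[3] ^ w[5] ^ w[7]
    w[2] = w[3] ^ w[6] ^ w[7]
    w[4] = w[5] ^ w[6] ^ w[7]
    w[0] = sum(w[1:]) % 2
    return w


def decode(word: List[int]) -> Tuple[str, Optional[List[int]]]:
    """Return (status, data); status is 'ok', 'corrected' or 'double'."""
    w = list(word)
    syndrome = 0
    for pos in range(1, N):
        if w[pos]:
            syndrome ^= pos
    odd_parity = sum(w) % 2 == 1
    if syndrome != 0 and not odd_parity:
        return "double", None
    status = "ok"
    if odd_parity:
        w[syndrome] ^= 1  # syndrome 0 points at the overall parity bit
        status = "corrected"
    return status, [w[3], w[5], w[6], w[7]]


class MemoryWord:
    """Hypothetical model of one memory word with an optional stuck-at bit."""

    def __init__(self, stuck_pos: Optional[int] = None, stuck_val: int = 0):
        self.stuck_pos, self.stuck_val = stuck_pos, stuck_val
        self.bits = [0] * N

    def write(self, bits: List[int]) -> None:
        self.bits = list(bits)
        if self.stuck_pos is not None:
            self.bits[self.stuck_pos] = self.stuck_val

    def read(self) -> List[int]:
        return list(self.bits)

    def flip(self, pos: int) -> None:
        """Inject a soft error (a stuck bit cannot be flipped)."""
        if pos != self.stuck_pos:
            self.bits[pos] ^= 1


def read_word(word: MemoryWord) -> Tuple[str, Optional[List[int]]]:
    register = word.read()                 # copy the word into a register
    status, data = decode(register)
    if status != "double":
        return status, data
    word.write([0] * N)
    zeros = word.read()                    # write all zeros, read back
    word.write([1] * N)
    ones = word.read()                     # write all ones, read back
    stuck = [i for i in range(N) if zeros[i] == ones[i]]
    if len(stuck) != 1:
        return "double", None              # no isolated defect located
    register[stuck[0]] ^= 1                # change the value of the defective bit
    status, data = decode(register)        # then decode as usual
    if data is not None:
        word.write(encode(data))           # assumption: restore the word
    return status, data


data = [1, 0, 1, 1]
cell = MemoryWord(stuck_pos=5, stuck_val=1)
cell.write(encode(data))   # bit 5 should store 0 but is stuck at 1
cell.flip(2)               # one soft error on top of the defect
print(read_word(cell))
```

In this sketch the plain decoder reports a double error for the defect plus soft error, and the write/read test isolates bit 5 so that only the soft error is left for the SEC-DED decoder.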
Continued
Case 4: 1 defect (stuck at the opposite of the
bit's value) + 2 soft errors (triple error
detectable)
The syndrome will mark a triple error
The word will be copied into a register
Write all zeros to the word, read it back
Write all ones to the word, read it back
Locate the permanent error (from previous 2
steps)
Then output the signal for Double Error
Continued
Case 5: 1 defect (stuck at the bit's own
value) + 2 soft errors
The syndrome will mark a double error
The word will be copied into the register
Write all zeros to the word, read it back
Write all ones to the word, read it back
Locate the permanent error (the defect) from the
previous 2 steps
Change the value of that bit (this introduces a new error)
and re-evaluate the syndrome
If the triple error is detectable, output the signal for
Double Error; otherwise the word will be read miscorrected
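Whether a triple error is detectable depends on the code. As a rough sketch, assume a Hamming code whose parity-check column for position i is the binary value of i, extended with an overall parity bit at position 0 (this layout and the function name `triple_error_detection_fraction` are assumptions for illustration, not taken from the slides). A triple error then has odd overall parity and a syndrome equal to the XOR of the three positions, so it is flagged only when that syndrome points outside the word; otherwise the decoder treats it as a single error and miscorrects.

```python
from itertools import combinations


def triple_error_detection_fraction(n_hamming: int) -> float:
    """Fraction of 3-bit error patterns that a SEC-DED decoder can flag.

    Positions 1..n_hamming belong to the Hamming code; position 0 is the
    overall parity bit and contributes 0 to the syndrome.
    """
    detected = 0
    total = 0
    for a, b, c in combinations(range(n_hamming + 1), 3):
        syndrome = a ^ b ^ c
        total += 1
        if syndrome > n_hamming:   # syndrome matches no bit position
            detected += 1
    return detected / total


# Full-length code (8 bits in total): every syndrome maps to a real position.
print(triple_error_detection_fraction(7))
# Shortened code (13 bits in total): syndromes 13..15 are unused.
print(triple_error_detection_fraction(12))
```

Under these assumptions a full-length code never flags a triple error, while a shortened code flags some of them, which is why both outcomes appear in Case 5.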
Analysis
Analysis of the reliability of a memory in which SEC-DED is
used to deal with soft errors and isolated stuck-at failures
Mean number of events to failure (METF)
METF|permanent ≈ 1/F    (1)
And for soft errors only it would be
METF|soft_errors ≈ sqrt(pi*M/2)    (2)
Continued
Failures caused by soft errors only will be
dominant when
METF|permanent >> METF|soft_errors    (3)
If we define the METF ratio as
r = METF|soft_errors / METF|permanent ≈ F * sqrt(pi*M/2)    (4)
the condition becomes
r ≈ F * sqrt(pi*M/2) << 1    (5)
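A small numerical sketch of this comparison follows. It rests on assumptions that are not stated in the slides: F is taken to be the fraction of words that contain a stuck-at defect, M the number of words in the memory, and the soft-error-only METF is taken to follow the birthday-type approximation sqrt(pi*M/2) (the first time two soft errors land in the same word). All function names are hypothetical.

```python
import math
import random


def metf_permanent(F: float) -> float:
    """Assumed METF when failures are driven by defective words."""
    return 1.0 / F


def metf_soft_errors(M: int) -> float:
    """Assumed birthday-type approximation for soft errors only."""
    return math.sqrt(math.pi * M / 2)


def metf_ratio(F: float, M: int) -> float:
    """r = METF|soft_errors / METF|permanent; r << 1 means soft errors dominate."""
    return metf_soft_errors(M) / metf_permanent(F)


def simulate_soft_only(M: int, trials: int = 2000, seed: int = 0) -> float:
    """Monte Carlo estimate: events until some word is hit a second time."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        hit = set()
        events = 0
        while True:
            events += 1
            w = rng.randrange(M)
            if w in hit:
                break
            hit.add(w)
        total += events
    return total / trials


M = 2 ** 20
for F in (1e-6, 1e-4, 1e-2):
    print(F, metf_permanent(F), metf_soft_errors(M), metf_ratio(F, M))
print(simulate_soft_only(1000), metf_soft_errors(1000))
```

The loop prints the ratio for a few illustrative defect fractions, and the last line compares the closed-form approximation with a simulated average for a small memory.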
Conclusion
Use of SEC-DED to deal with both soft errors
and isolated stuck-at defects
Technique to deal with both types of errors
Can also be combined with traditional 2-D
repair approaches
References
W. K. Huang, Y.-N. Shen, and F. Lombardi, New approaches for the repairs
of memories with redundancy by row/column deletion for yield
enhancement, IEEE Trans. Computer-Aided Des. Integr. Circuits Syst., vol.
9, no. 3, pp. 323-328, Mar. 1990.
R. W. Hamming, Error detecting and error correcting codes, Bell Syst.
Tech. J., vol. 29, no. 2, pp. 147-160, 1950.
M. Blaum, R. Goodman, and R. McEliece, The reliability of single error
protected computer memories, IEEE Trans. Comput., vol. 37, no. 1, pp.
114-119, Jan. 1988.
V. Shridhar and M. Rajendra Prasad, Built-in self-repair (BISR) technique
widely used to repair embedded random access memories (RAMs),
International Journal of Computer Science Engineering (IJCSE), vol. 1, no.
01, Sep. 2012.