Chap 02
2.6.2 EBCDIC 63
• EBCDIC (Extended Binary Coded Decimal Interchange Code) expands BCD from 6
bits to 8 bits. See Page 64 FIGURE 2.6.
2.6.4 Unicode 65
• Both EBCDIC and ASCII were built around the Latin alphabet.
• In 1991, a new international information exchange code called Unicode was introduced.
• Unicode is a 16-bit alphabet that is downward compatible with ASCII and the
Latin-1 character set.
• Because the base coding of Unicode is 16 bits, it has the capacity to encode the
majority of characters used in every language of the world.
• Unicode is currently the default character set of the Java programming language.
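As a quick illustration of this downward compatibility, the Python snippet below (illustrative, not from the text) checks that ASCII and Latin-1 characters keep their code-point values in Unicode:

```python
# Unicode's first 128 code points are ASCII, and its first 256 are Latin-1,
# so existing ASCII/Latin-1 text maps into Unicode without renumbering.
assert ord('A') == 0x41    # 'A' is 65 in ASCII and in Unicode
assert ord('é') == 0xE9    # 'é' is 0xE9 in Latin-1 and in Unicode
```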
• If the limiting factor in the design of a disk is the number of flux transitions per
square millimeter, we can pack 50% more bits into the same magnetic area using RLL
than we could using MFM.
• RLL is used almost exclusively in the manufacture of high-capacity disk drives.
o If no bits are lost or corrupted, dividing the received information string by the
agreed-upon pattern will give a remainder of zero.
o We can see that this is so by working through the division.
o Real applications use longer polynomials to cover larger information strings.
• A remainder other than zero indicates that an error has occurred in the transmission.
• This method works best when a large prime polynomial is used.
• There are four standard polynomials used widely for this purpose:
o CRC-CCITT (ITU-T): X^16 + X^12 + X^5 + 1
o CRC-12: X^12 + X^11 + X^3 + X^2 + X + 1
o CRC-16 (ANSI): X^16 + X^15 + X^2 + 1
o CRC-32: X^32 + X^26 + X^23 + X^22 + X^16 + X^12 + X^11 + X^10 + X^8 + X^7
+ X^5 + X^4 + X^2 + X + 1
• It has been proven that CRCs using these polynomials can detect over 99.8%
of all errors.
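The modulo-2 division described above can be sketched in a few lines of Python. This is a minimal illustration, not a production CRC: the function name mod2_div and the 10-bit example string are made up here, and the default polynomial is CRC-CCITT (X^16 + X^12 + X^5 + 1, encoded as 0x11021):

```python
def mod2_div(dividend: int, nbits: int, poly: int = 0x11021, degree: int = 16) -> int:
    """Remainder of modulo-2 (XOR, no-carry) polynomial division.

    `nbits` is the total bit length of the dividend; the remainder comes
    back in the low `degree` bits. Default poly is CRC-CCITT.
    """
    for i in range(nbits - degree - 1, -1, -1):
        # If the current leading bit is 1, "subtract" (XOR) the shifted divisor.
        if dividend & (1 << (i + degree)):
            dividend ^= poly << i
    return dividend

info = 0b1101011011                     # a 10-bit information string
crc = mod2_div(info << 16, 10 + 16)     # check bits: remainder of info * X^16
codeword = (info << 16) | crc           # transmit information + check bits

# Error-free transmission: dividing the received string leaves remainder 0.
assert mod2_div(codeword, 10 + 16) == 0
# A single-bit error leaves a nonzero remainder and is therefore detected.
assert mod2_div(codeword ^ (1 << 5), 10 + 16) != 0
```

Because XOR division is linear, the remainder of a corrupted word equals the remainder of the error pattern alone, which is why a nonzero remainder flags the corruption.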
• We focus on single-bit errors. An error could occur in any of the n bits, so each
code word can be associated with n erroneous words at a Hamming distance of 1.
• Therefore, we have n + 1 bit patterns for each code word: one valid code word and n
erroneous words. With n-bit code words, there are 2^n possible bit patterns, of
which only 2^m are valid code words built from m data bits (where n = m + r,
with r check bits).
This gives us the inequality:
(n + 1) × 2^m ≤ 2^n
Because n = m + r, we can rewrite the inequality as:
(m + r + 1) × 2^m ≤ 2^(m + r), or (m + r + 1) ≤ 2^r
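The inequality (m + r + 1) ≤ 2^r tells us how many check bits r a single-error-correcting code needs for m data bits. A minimal sketch (the function name is illustrative):

```python
def check_bits_needed(m: int) -> int:
    """Smallest r satisfying m + r + 1 <= 2**r (single-error correction)."""
    r = 1
    while m + r + 1 > 2 ** r:
        r += 1
    return r

# An 8-bit data word needs 4 check bits: 8 + 4 + 1 = 13 <= 2**4 = 16.
assert check_bits_needed(8) == 4
```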
Let’s introduce an error in bit position b9, resulting in the code word:
Bit:       0  1  0  1  1  1  0  1  0  1  1  0
Position: 12 11 10  9  8  7  6  5  4  3  2  1
We found that parity bits 1 and 8 produced an error, and 1 + 8 = 9, which is exactly
where the error occurred.
2.8.3 Reed-Solomon 82
• If we expect errors to occur in blocks, it stands to reason that we should use an error-
correcting code that operates at a block level, as opposed to a Hamming code, which
operates at the bit level.
• A Reed-Solomon (RS) code can be thought of as a CRC that operates over entire
characters instead of only a few bits.
• RS codes, like CRCs, are systematic: The parity bytes are appended to a block of
information bytes.
• RS(n, k) codes are defined using the following parameters:
o s = The number of bits in a character (or “symbol”).
o k = The number of s-bit characters comprising the data block.
o n = The number of bits in the code word.
• An RS(n, k) code can correct (n − k)/2 errors in the k information bytes.
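The error-correcting capacity (n − k)/2 is easy to check directly. As an assumed example (not from the text), the widely deployed RS(255, 223) code carries 223 data bytes and 32 parity bytes per block:

```python
def rs_correctable(n: int, k: int) -> int:
    """Symbol errors an RS(n, k) code can correct: (n - k) / 2."""
    return (n - k) // 2

# RS(255, 223): 32 parity symbols per block correct up to 16 symbol errors.
assert rs_correctable(255, 223) == 16
```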
Chapter Summary 83
• Computers store data in the form of bits, bytes, and words using the binary
numbering system.
• Hexadecimal numbers are formed using four-bit groups called nibbles (or nybbles).
• Signed integers can be stored in one’s complement, two’s complement, or signed
magnitude representation.
• Floating-point numbers are usually coded using the IEEE 754 floating-point
standard.
• Character data is stored using ASCII, EBCDIC, or Unicode.
• Data transmission and storage codes are devised to convey or store bytes reliably
and economically.
• Error detecting and correcting codes are necessary because we can expect no
transmission or storage medium to be perfect.
• CRC, Reed-Solomon, and Hamming codes are three important error control codes.