Hamming Code
Hamming codes can detect and correct single-bit errors. In contrast, the simple parity code cannot detect errors in which two bits are transposed, nor can it correct the errors it can find.
In mathematical terms, Hamming codes are a class of binary linear codes. For each integer m > 1 there is a code with parameters [2^m − 1, 2^m − m − 1, 3]. The parity-check matrix of a Hamming code is constructed by listing all columns of length m that are pairwise independent.
History
Hamming worked at Bell Labs in the 1940s on the Bell Model V computer, an
electromechanical relay-based machine with cycle times in seconds. Input was fed in on
punch cards, which would invariably have read errors. During weekdays, special code
would find errors and flash lights so the operators could correct the problem. During
after-hours periods and on weekends, when there were no operators, the machine simply
moved on to the next job.
Hamming worked on weekends, and grew increasingly frustrated with having to restart
his programs from scratch due to the unreliability of the card reader. Over the next few
years he worked on the problem of error-correction, developing an increasingly powerful
array of algorithms. In 1950 he published what is now known as Hamming Code, which
remains in use in some applications today.
Codes predating Hamming
A number of simple error-detecting codes were used before Hamming codes, but none were as effective as Hamming codes at the same overhead of space.
Parity
Parity adds a single bit that indicates whether the number of 1 bits in the preceding data
was even or odd. If a single bit is changed in transmission, the message will change parity
and the error can be detected at this point. (Note that the bit that changed may have been
the parity bit itself!) The most common convention is that a parity value of 1 indicates
that there is an odd number of ones in the data, and a parity value of 0 indicates that there
is an even number of ones in the data. In other words: The data and the parity bit
together should contain an even number of 1s.
Parity checking is not very robust, since if the number of bits changed is even, the check
bit will be valid and the error will not be detected. Moreover, parity does not indicate
which bit contained the error, even when it can detect it. The data must be discarded
entirely, and re-transmitted from scratch. On a noisy transmission medium a successful
transmission could take a long time, or even never occur. While parity checking is not
very good, it uses only a single bit, resulting in the least overhead, and it does allow for the restoration of a missing bit when the position of the missing bit is known.
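To make the mechanics concrete, here is a minimal sketch in Python (the helper names are illustrative, not from any library) showing how an even-parity bit is computed, how a double error slips past the check, and how a known-position erasure can be restored:

    def parity_bit(bits):
        """Even-parity bit: 1 if the data contains an odd number of 1s."""
        return sum(bits) % 2

    def check_even_parity(word):
        """True if data plus parity bit together contain an even number of 1s."""
        return sum(word) % 2 == 0

    def restore_erased_bit(word, missing_index):
        """Restore a bit whose position is known (an 'erasure') by choosing
        the value that makes the total parity even."""
        return sum(b for i, b in enumerate(word) if i != missing_index) % 2

    data = [1, 0, 1, 1, 0, 1, 0]           # a 7-bit word
    word = data + [parity_bit(data)]       # append the even-parity bit
    assert check_even_parity(word)
    word[0] ^= 1; word[1] ^= 1             # flip two bits
    assert check_even_parity(word)         # still passes: parity misses even errors
    assert restore_erased_bit(word, 2) == word[2]   # erasure recovery works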
Two-out-of-five code
In the 1940s Bell used a slightly more sophisticated m-of-n code known as the two-out-of-five code. This code ensured that every block of five bits (known as a 5-block) had exactly two 1s. The computer could tell that there was an error if its input did not contain exactly two 1s in each block. Two-out-of-five was still only able to detect single-bit errors: if one bit flipped to a 1 and another flipped to a 0 in the same block, the two-out-of-five rule remained true and the error went undiscovered.
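A sketch of the check, in the same illustrative style: a 5-block is valid only if it contains exactly two 1s, so one flipped bit is caught, but a compensating pair of flips is not:

    def valid_two_of_five(block):
        """A 5-bit block is valid iff it contains exactly two 1s."""
        return len(block) == 5 and sum(block) == 2

    assert valid_two_of_five([0, 1, 0, 0, 1])      # a legal 5-block
    assert not valid_two_of_five([0, 1, 0, 1, 1])  # one bit flipped: detected
    assert valid_two_of_five([1, 1, 0, 0, 0])      # a 1->0 and a 0->1 flip: missed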
Repetition
Another code in use at the time repeated every data bit several times in order to ensure
that it got through. For instance, if the data bit to be sent was a 1, an n=3 repetition code
would send "111". If the three bits received were not identical, an error occurred. If the
channel is clean enough, most of the time only one bit will change in each triple.
Therefore, 001, 010, and 100 each correspond to a 0 bit, while 110, 101, and 011
correspond to a 1 bit, as though the bits counted as "votes" towards what the original bit
was. A code with this ability to reconstruct the original message in the presence of errors
is known as an error-correcting code.
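A minimal sketch of an n = 3 repetition code with majority-vote decoding (the function names are illustrative):

    def encode_repetition(bits, n=3):
        """Repeat each data bit n times."""
        return [b for b in bits for _ in range(n)]

    def decode_repetition(coded, n=3):
        """Majority vote over each group of n received bits."""
        return [int(sum(coded[i:i + n]) * 2 > n) for i in range(0, len(coded), n)]

    sent = encode_repetition([1, 0])           # -> [1, 1, 1, 0, 0, 0]
    sent[1] ^= 1                               # one bit flips in the first triple
    assert decode_repetition(sent) == [1, 0]   # the vote still recovers the data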
Such codes cannot correctly repair all errors, however. In our example, if the channel
flipped two bits and the receiver got "001", the system would detect the error, but
conclude that the original bit was 0, which is incorrect. If we increase the number of
times we duplicate each bit to four, we can detect all two-bit errors but can't correct them
(the votes "tie"); at five, we can correct all two-bit errors, but not all three-bit errors.
Moreover, the repetition code is extremely inefficient, reducing throughput by a factor of three in our original case, and its efficiency drops drastically as we increase the number of times each bit is duplicated in order to detect and correct more errors.
Hamming codes
Hamming studied the existing coding schemes, including two-of-five, and generalized their concepts. To start with, he developed a nomenclature to describe the system,
including the number of data bits and error-correction bits in a block. For instance, parity
includes a single bit for any data word, so assuming 7-bit ASCII words, Hamming described this as an (8,7) code, with eight bits in total, of which 7 are data. The repetition example would be (3,1), following the same logic. The information rate is the second number divided by the first; for our repetition example it is 1/3.
Hamming also noticed the problems with flipping two or more bits, and described this as
the "distance" (it is now called the Hamming distance, after him). Parity has a distance of
2, as any two bit flips will be invisible. The (3,1) repetition has a distance of 3, as three
bits need to be flipped in the same triple to obtain another code word with no visible
errors. A (4,1) repetition (each bit is repeated four times) has a distance of 4, so flipping
two bits can be detected, but not corrected. When three bits flip in the same group there
can be situations where the code corrects towards the wrong code word.
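The distance itself is straightforward to compute: it is the number of positions in which two equal-length words differ. A small sketch:

    def hamming_distance(a, b):
        """Number of positions in which two equal-length words differ."""
        assert len(a) == len(b)
        return sum(x != y for x, y in zip(a, b))

    # The two codewords of the (3,1) repetition code are distance 3 apart, so
    # no single or double bit flip can silently turn one into the other.
    assert hamming_distance("000", "111") == 3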
Hamming was interested in two problems at once: increasing the distance as much as possible, while at the same time increasing the information rate as much as possible.
During the 1940s he developed several encoding schemes that were dramatic
improvements on existing codes. The key to all of his systems was to have the parity bits
overlap, such that they managed to check each other as well as the data.
General algorithm
Although any number of algorithms can be created, the following general algorithm positions the parity bits at powers of two to ease calculation of which bit was flipped upon detection of incorrect parity.
1. All bit positions that are powers of two are used as parity bits. (positions 1, 2, 4, 8,
16, 32, 64, etc.), see A000079 at the On-Line Encyclopedia of Integer Sequences.
2. All other bit positions are for the data to be encoded. (positions 3, 5, 6, 7, 9, 10,
11, 12, 13, 14, 15, 17, etc.), see A057716 at the On-Line Encyclopedia of Integer
Sequences.
3. Each parity bit calculates the parity for some of the bits in the code word. The
position of the parity bit determines the sequence of bits that it alternately checks
and skips.
o Position 1 (n=1): skip 0 bit (0=n-1), check 1 bit (n), skip 1 bit (n), check 1
bit (n), skip 1 bit (n), etc.
o Position 2 (n=2): skip 1 bit (1=n-1), check 2 bits (n), skip 2 bits (n), check
2 bits (n), skip 2 bits (n), etc.
o Position 4 (n=4): skip 3 bits (3=n-1), check 4 bits (n), skip 4 bits (n),
check 4 bits (n), skip 4 bits (n), etc.
o Position 8 (n=8): skip 7 bits (7=n-1), check 8 bits (n), skip 8 bits (n),
check 8 bits (n), skip 8 bits (n), etc.
o Position 16 (n=16): skip 15 bits (15=n-1), check 16 bits (n), skip 16 bits
(n), check 16 bits (n), skip 16 bits (n), etc.
o Position 32 (n=32): skip 31 bits (31=n-1), check 32 bits (n), skip 32 bits
(n), check 32 bits (n), skip 32 bits (n), etc.
o General rule for position n: skip n−1 bits, check n bits, skip n bits, check n bits, and so on.
In other words, the parity bit at position 2^k checks bits in positions having bit k set in their binary representation. Conversely, bit 13, i.e. 1101(2), is checked by the parity bits at positions 1000(2) = 8, 0100(2) = 4 and 0001(2) = 1.
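The rule translates directly into code. The following sketch (illustrative names; bit positions are 1-based, matching the description above) places the data bits at the non-power-of-two positions and then sets each parity bit to the even parity of the positions it covers:

    def hamming_encode(data_bits):
        """Encode data bits into a Hamming code word with parity bits at
        the power-of-two positions (positions are 1-based)."""
        r = 0                                   # number of parity bits needed
        while (1 << r) < len(data_bits) + r + 1:
            r += 1
        length = len(data_bits) + r
        word = [0] * (length + 1)               # index 0 unused
        data = iter(data_bits)
        for pos in range(1, length + 1):
            if pos & (pos - 1):                 # not a power of two: data position
                word[pos] = next(data)
        for k in range(r):
            p = 1 << k
            # parity bit p covers every position whose binary form has bit k set
            word[p] = sum(word[pos] for pos in range(1, length + 1)
                          if pos & p and pos != p) % 2
        return word[1:]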
In general, if the minimum distance of an error-correcting code is W, the code can detect W − 1 bit errors and can correct integer((W − 1)/2) bit errors, where the function integer() returns the integer part of the division without performing any rounding.
Hamming(7,4) code
Main article: Hamming(7,4)
[Figure: graphical depiction of the 4 data bits and 3 parity bits, showing which parity bits apply to which data bits.]
In 1950, Hamming introduced the (7,4) code. It encodes 4 data bits into 7 bits by adding
three parity bits. Hamming(7,4) can detect and correct single-bit errors but can only
detect double-bit errors.
For example, 1011 is encoded into 0110011, where positions 3, 5, 6 and 7 carry the data bits (1, 0, 1, 1) and positions 1, 2 and 4 carry the parity bits (0, 1, 0).
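As a quick check, the hamming_encode sketch from the general-algorithm section reproduces this codeword:

    assert hamming_encode([1, 0, 1, 1]) == [0, 1, 1, 0, 0, 1, 1]   # "0110011"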
Hamming(8,4) code
[Figure: the same (7,4) example from above with an extra parity bit.]
The Hamming(7,4) code can easily be extended to an (8,4) code by adding an extra parity bit on top of the (7,4)-encoded word (see the previous section). This can be summed up in a revised parity-check matrix, whose columns correspond to bit positions 1 through 8 (the first three rows follow from the coverage rules above; the fourth row is reconstructed here as the all-ones overall-parity row described below):

    1 0 1 0 1 0 1 0
    0 1 1 0 0 1 1 0
    0 0 0 1 1 1 1 0
    1 1 1 1 1 1 1 1
The addition of the fourth row computes the sum of all bits (data and parity) as the fourth
parity bit.
For example, 1011 is encoded into 01100110, where positions 3, 5, 6 and 7 carry the data bits, positions 1, 2 and 4 carry the parity bits from the Hamming(7,4) code, and the final digit is the overall parity bit added by Hamming(8,4). That final bit makes the total parity of the eight-bit word even.
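In code, the extension is one more parity bit computed over the whole (7,4) word; a sketch building on the hamming_encode function above:

    def hamming_encode_extended(data_bits):
        """Hamming(8,4)-style extension: append an overall even-parity bit."""
        word = hamming_encode(data_bits)
        return word + [sum(word) % 2]

    assert hamming_encode_extended([1, 0, 1, 1]) == [0, 1, 1, 0, 0, 1, 1, 0]  # "01100110"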
Hamming(11,7) code
[Figure: mapping of the example data value; the parity of the red, yellow, green and blue circles is even.]
[Figure: a bit error on bit 11 causes bad parity in the red, yellow and green circles.]
Consider the 7-bit data word "0110101". To demonstrate how Hamming codes are
calculated and used to detect an error, see the tables below. They use d to signify data
bits and p to signify parity bits.
First the data bits are inserted into their appropriate positions and the parity bits are calculated in each case using even parity; the table below shows which of the four parity bits cover which data bits.
        p1  p2  d1  p3  d2  d3  d4  p4  d5  d6  d7
p1 = 1   1       0       1       0       1       1
p2 = 0       0   0           1   0           0   1
p3 = 0               0   1   1   0
p4 = 0                               0   1   0   1
The new data word (with parity bits) is now "10001100101". We now assume the final
bit gets corrupted and turned from 1 to 0. Our new data word is "10001100100"; and this
time when we analyse how the Hamming codes were created we flag each parity bit as 1
when the even parity check fails.
        p1  p2  d1  p3  d2  d3  d4  p4  d5  d6  d7  Parity check  Parity bit
p1       1       0       1       0       1       0  Fail          1
p2           0   0           1   0           0   0  Fail          1
p3                   0   1   1   0                  Pass          0
p4                                   0   1   0   0  Fail          1
The final step is to evaluate the value of the parity bits (remembering the bit with lowest
index is the least significant bit, i.e., it goes furthest to the right). The integer value of the
parity bits is 11, signifying that the 11th bit in the data word (including parity bits) is
wrong and needs to be flipped.
          p4  p3  p2  p1
Binary     1   0   1   1
Decimal    8       2   1   Σ = 11
Flipping the 11th bit changes 10001100100 back into 10001100101. Removing the
Hamming codes gives the original data word of 0110101.
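The error-locating step sketches neatly in code: recompute each parity check over the received word and read the failing checks as a binary number (the syndrome), which points at the offending position. The example from the text decodes as expected:

    def hamming_syndrome(word):
        """Return the 1-based position of a single-bit error, or 0 if every
        check passes; parity bits sit at power-of-two positions."""
        syndrome, p = 0, 1
        while p <= len(word):
            # even-parity check over every position having this bit set
            if sum(word[pos - 1] for pos in range(1, len(word) + 1) if pos & p) % 2:
                syndrome += p          # a failing check contributes its position
            p <<= 1
        return syndrome

    received = [int(c) for c in "10001100100"]   # the corrupted word from the text
    pos = hamming_syndrome(received)
    assert pos == 11                             # p1, p2 and p4 fail: 1 + 2 + 8 = 11
    received[pos - 1] ^= 1                       # flip bit 11 back
    assert "".join(map(str, received)) == "10001100101"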
Note that as parity bits do not check each other, if a single parity bit check fails and all
others succeed, then it is the parity bit in question that is wrong and not any bit it checks.
Finally, suppose two bits change, at positions x and y. If x and y have the same bit value at position k of their binary representations, then the parity bit at position 2^k checks them both, and so will remain the same. However, some parity bit must be altered, because x ≠ y means their binary representations differ in at least one position k, and the parity bit at position 2^k is then affected by exactly one of the two flips. Thus the Hamming code detects all two-bit errors; however, it cannot distinguish them from one-bit errors.
• Error detection is the ability to detect errors caused by noise or other impairments during transmission from the transmitter to the receiver.
• Error correction has an additional feature that enables identification and correction of the errors.
• Error detection always precedes error correction.
Our goal is error correction. There are two ways to design the channel code and protocol for an error-correcting system.
• Automatic repeat request (ARQ): The transmitter sends the data and also a "check code", which the receiver uses to check for errors. If it does not find any errors, it sends a message (an ACK, or acknowledgment) back to the transmitter. The transmitter re-transmits any data that was not ACKed. (A minimal sketch of this loop follows the list.)
• Forward error correction (FEC): The transmitter encodes the data with an error-correcting code and sends the coded message. The receiver never sends any messages back to the transmitter. The receiver decodes what it receives into the "most likely" data. The codes are designed so that it would take an "unreasonable" amount of noise to trick the receiver into misinterpreting the data.
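As a toy illustration of the ARQ pattern only (the "check code" here is a single parity bit, far weaker than real check codes, and the channel function is a stand-in), the send/check/retransmit loop might be sketched like this:

    import random

    def send_with_arq(data, channel, max_tries=10):
        """Stop-and-wait ARQ sketch: retransmit until the receiver's check passes."""
        for _ in range(max_tries):
            frame = data + [sum(data) % 2]   # data plus a simple parity check code
            received = channel(frame)        # the channel may corrupt the frame
            if sum(received) % 2 == 0:       # check passes: receiver would send ACK
                return received[:-1]
            # no ACK: the transmitter sends the frame again
        raise RuntimeError("too many retransmissions")

    # Toy channel: each bit flips with probability 0.1. Note that an even
    # number of flips would slip past this weak parity check.
    noisy = lambda frame: [b ^ (random.random() < 0.1) for b in frame]
    print(send_with_arq([1, 0, 1, 1], noisy))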
Variations on this theme exist. Given a stream of data to be sent, the data is broken up into blocks of bits, and each block is sent some predetermined number of times. For example, if we want to send "1011", we may repeat the block three times.
Suppose we send "1011 1011 1011", and this is received as "1010 1011 1011". As one group is not the same as the other two, we can determine that an error has occurred. This scheme is not very efficient, and it is susceptible to problems if the error occurs in exactly the same place in each group (e.g. "1010 1010 1010" in the example above would be accepted as correct).
The scheme however is extremely simple, and is in fact used in some transmissions of
numbers stations.
The stream of data is broken up into blocks of bits, and the number of 1 bits is counted.
Then, a "parity bit" is set (or cleared) if the number of one bits is odd (or even). (This
scheme is called even parity; odd parity can also be used.) If the tested blocks overlap,
then the parity bits can be used to isolate the error, and even correct it if the error affects a
single bit: this is the principle behind the Hamming code.
There is a limitation to parity schemes. A parity bit is only guaranteed to detect an odd
number of bit errors (one, three, five, and so on). If an even number of bits (two, four, six
and so on) are flipped, the parity bit appears to be correct, even though the data are
corrupt.
One less commonly used form of error correction and detection is transmitting a polarity
reversed bitstream simultaneously with the bitstream it is meant to correct. This scheme
is very weak at detecting bit errors, and marginally useful for byte or word error detection
and correction. However, at the physical layer in the OSI model, this scheme can aid in
error correction and detection.
Polarity symbol reversal is (probably) the simplest form of Turbo code, but technically
not a Turbo code at all.
• Turbo codes DO NOT work at the bit level.
• Turbo codes typically work at the character or symbol level depending on their placement in the OSI model.
• Character here refers to Baudot, ASCII-7, the 8-bit byte or the 16-bit word.
Transmitter end
• transmit 1011 on carrier wave 1 (CW1)
• transmit 0100 on carrier wave 2 (CW2)
Receiver end
• check that the bit polarities of CW1 and CW2 are opposite
• if CW1 == CW2, signal a bit error (which triggers more complex ECC)
This polarity reversal scheme works fairly well at low data rates (below 300 baud) with very redundant data like telemetry data.
More complex error detection (and correction) methods make use of the properties of
finite fields and polynomials over such fields.
The cyclic redundancy check considers a block of data as the coefficients of a polynomial and then divides it by a fixed, predetermined polynomial. The coefficients of the remainder of this division are taken as the redundant data bits, the CRC.
On reception, one can recompute the CRC from the payload bits and compare this with
the CRC that was received. A mismatch indicates that an error occurred.
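A bit-level sketch of the idea, using a small illustrative generator polynomial rather than any standard CRC: treat the message bits as polynomial coefficients over GF(2), divide by the generator, and append the remainder. The receiver's recomputation over message-plus-CRC must then leave a zero remainder:

    def crc_remainder(bits, poly):
        """Polynomial division over GF(2). `poly` includes the leading term,
        e.g. [1, 0, 1, 1] for x^3 + x + 1; returns len(poly) - 1 remainder bits."""
        bits = bits + [0] * (len(poly) - 1)    # shift the message left by the degree
        for i in range(len(bits) - len(poly) + 1):
            if bits[i]:                        # XOR in the divisor at each 1 bit
                for j, p in enumerate(poly):
                    bits[i + j] ^= p
        return bits[-(len(poly) - 1):]

    message = [1, 1, 0, 1, 0, 1]
    crc = crc_remainder(message, [1, 0, 1, 1])
    # On reception, recomputing over message + CRC must give a zero remainder.
    assert crc_remainder(message + crc, [1, 0, 1, 1]) == [0, 0, 0]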
If we want to detect d bit errors in an n-bit word, we can map every n-bit word into a bigger (n+d+1)-bit word so that the minimum Hamming distance between valid mappings is d+1. That way, d or fewer bit errors can never transform one valid word into another, because the Hamming distance between any two valid words is at least d+1; such errors lead only to invalid words, which the receiver detects. Given a stream of m·n bits, we can therefore detect up to d bit errors in each n-bit word using this method; in fact, we can detect a maximum of m·d errors if every n-bit word is transmitted with at most d errors.
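The sketch below makes this concrete for the (3,1) repetition code, whose minimum distance is 3 = d + 1 with d = 2: flipping one or two bits of a valid word never yields the other valid word, so such errors are always detected:

    from itertools import combinations

    def min_distance(codewords):
        """Minimum pairwise Hamming distance of a code."""
        return min(sum(x != y for x, y in zip(a, b))
                   for a, b in combinations(codewords, 2))

    code = ["000", "111"]                  # the (3,1) repetition code
    assert min_distance(code) == 3         # d + 1 = 3, so up to d = 2 errors detectable

    # No pattern of 1 or 2 flips turns one valid word into the other.
    for word in code:
        for k in (1, 2):
            for idxs in combinations(range(3), k):
                flipped = "".join(str(int(c) ^ (i in idxs)) for i, c in enumerate(word))
                assert flipped not in code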
It would be advantageous if the receiver could somehow determine what the error was
and thus correct it. Is this even possible? Yes, consider the NATO phonetic alphabet -- if
a sender were to be sending the word "WIKI" with the alphabet by sending "WHISKEY
INDIA KILO INDIA" and this was received (with * signifying letters received in error)
as "W***KEY I**I* **LO **DI*", it would be possible to correct all the errors here
since there is only one word in the NATO phonetic alphabet which starts with "W" and
ends in "KEY", and similarly for the other words. This idea is also present in some error
correcting codes (ECC).
Error-correcting schemes also have their limitations. Some can correct a certain number
of bit errors and only detect further numbers of bit errors. Codes which can correct one
error are termed single error correcting (SEC), and those which detect two are termed
double error detecting (DED). There are codes which can correct and detect more errors
than these.
An error-correcting code which corrects all errors of up to n bits correctly is also an error-
detecting code which can detect at least all errors of up to 2n bits.
Because soft errors are extremely common in the DRAM of computers used in satellites
and space probes, such memory is structured as ECC memory (also called "EDAC
protected memory"). Typically every bit of memory is refreshed at least 15 times per
second. During this memory refresh, the memory controller reads each word of memory
and writes the (corrected) word back. Such memory controllers traditionally use
a Hamming code, although some use triple modular redundancy. Even though a single
cosmic ray can upset many physically neighboring bits in a DRAM, such memory
systems are designed so that neighboring bits belong to different words, so such single
event upsets (SEUs) cause only a single error in any particular word, and so can be
corrected by a single-bit error correcting code. As long as no more than a single bit in any
particular word is hit by an error between refreshes, such a memory system presents the
illusion of an error-free memory. [1] [2]
Applications
Applications that require low latency (such as telephone conversations) cannot use
Automatic Repeat reQuest (ARQ); they must use Forward Error Correction (FEC). By
the time an ARQ system discovers an error and re-transmits it, the re-sent data will arrive
too late to be any good.
Applications where the transmitter immediately forgets the information as soon as it is
sent (such as most television cameras) cannot use ARQ; they must use FEC because
when an error occurs, the original data is no longer available. (This is also why FEC is used in data storage systems such as RAID and distributed data stores.)
Applications that require extremely low error rates (such as digital money transfers) must
use ARQ.
• Each Ethernet frame carries a CRC-32 checksum. The receiver discards frames if their checksums don't match.
• The IPv4 header contains a header checksum of the contents of the header (excluding the checksum field). Packets with checksums that don't match are discarded.
• The checksum was omitted from the IPv6 header, because most current link-layer protocols have error detection.
• UDP has an optional checksum. Packets with wrong checksums are discarded.
• TCP has a checksum of the payload, the TCP header (excluding the checksum field) and the source and destination addresses of the IP header. Packets found to have incorrect checksums are discarded and eventually get retransmitted when the sender receives a triple-ACK or a timeout occurs.
NASA has used many different error correcting codes. For missions between 1969 and
1977 the Mariner spacecraft used a Reed-Muller code. The noise these spacecraft were
subject to was well approximated by a "bell-curve" (normal distribution), so the Reed-
Muller codes were well suited to the situation.
The Voyager 1 & Voyager 2 spacecraft transmitted color pictures of Jupiter and Saturn in
1979 and 1980.
• Color image transmission required three times the amount of data, so the Golay (24,12,8) code was used.
• This Golay code is only 3-error-correcting, but it could be transmitted at a much higher data rate.
• Voyager 2 went on to Uranus and Neptune, and the code was switched to a concatenated Reed-Solomon/convolutional code for its substantially more powerful error-correcting capabilities.
• Current DSN error correction is done with dedicated hardware.
• For some NASA deep-space craft, such as those in the Voyager program, Cassini-Huygens (Saturn), New Horizons (Pluto) and Deep Space 1, the use of hardware ECC may not be feasible for the full duration of the mission.
The different kinds of deep space and orbital missions that are conducted suggest that
trying to find a "one size fits all" error correction system will be an ongoing problem for
some time to come.
• For missions close to the Earth, the nature of the "noise" is different from that on a spacecraft headed towards the outer planets.
• In particular, if a transmitter on a spacecraft far from Earth is operating at a low power, the problem of correcting for noise gets larger with distance from the Earth.
[Figure: block 2D and 3D bit-allocation models used by ECC coding systems in terrestrial telecommunications.]
The demand for satellite transponder bandwidth continues to grow, fueled by the desire to
deliver television (including new channels and High Definition TV) and IP data.
Transponder availability and bandwidth constraints have limited this growth, because
transponder capacity is determined by the selected modulation scheme and Forward error
correction (FEC) rate.
Overview
• QPSK coupled with traditional Reed-Solomon and Viterbi codes has been used for nearly 20 years for the delivery of digital satellite TV.
• Higher-order modulation schemes such as 8PSK, 16QAM and 32QAM have enabled the satellite industry to increase transponder efficiency by several orders of magnitude.
• This increase in the information rate in a transponder comes at the expense of an increase in the carrier power to meet the threshold requirement for existing antennas.
• Tests conducted using the latest chipsets demonstrate that the performance achieved by using Turbo codes may be even lower than the 0.8 dB figure assumed in early designs.
Error-correcting codes can be divided into block codes and convolutional codes. Block error-correcting codes, such as Reed-Solomon codes, transform a chunk of bits into a (longer) chunk of bits in such a way that errors up to some threshold in each block can be detected and corrected.
However, in practice errors often occur in bursts rather than at random. This is often
compensated for by shuffling (interleaving) the bits in the message after coding. Then
any burst of bit-errors is broken up into a set of scattered single-bit errors when the bits of
the message are unshuffled (de-interleaved) before being decoded.
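A sketch of a simple block interleaver illustrates the shuffle: coded bits are written into rows and read out by columns, so a burst of consecutive channel errors is spread across different codewords after de-interleaving (the sizes here are illustrative):

    def interleave(bits, rows, cols):
        """Write row by row, read column by column."""
        assert len(bits) == rows * cols
        return [bits[r * cols + c] for c in range(cols) for r in range(rows)]

    def deinterleave(bits, rows, cols):
        """Inverse permutation: write column by column, read row by row."""
        out = [0] * (rows * cols)
        for c in range(cols):
            for r in range(rows):
                out[r * cols + c] = bits[c * rows + r]
        return out

    coded = list(range(12))          # stand-in for three 4-bit codewords (rows)
    sent = interleave(coded, rows=3, cols=4)
    # A burst hitting 3 consecutive channel symbols touches 3 different rows,
    # i.e. at most one symbol per codeword after de-interleaving.
    assert deinterleave(sent, rows=3, cols=4) == coded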