An Application of Vector Space Theory in Data Tran
All content following this page was uploaded by Liejune Shiau on 24 March 2014.
LieJune Shiau
University of Houston-Clear Lake
2700 Bay Area Boulevard
Houston, Texas 77058 USA
[email protected]
Abstract
This work illustrates an application of vector spaces to data transmission theory. We show how Hamming code error
detection and error correction can be carried out using tools from vector space theory. It is hoped that this article
will convey to mathematics and computer science majors the importance of abstract mathematical concepts, such as
vector spaces and bases, in the application of data transmission.
1. Introduction
The purpose of this article is to introduce the reader to an interesting real world application of vector spaces. We wish to exhibit one of the many strong links between abstract and applied mathematics. The ubiquity of the Internet makes data transmission a particularly relevant “real world” problem. Computer data is transmitted in a binary mode and hence is represented in strings of 0’s and 1’s called bits. When transmitting from a home computer via a modem, data is susceptible to “noise” corruption by an electromotive force, or to loss of signal due to attenuation. Corruption of the binary signal simply means some 0’s are changed into 1’s and vice versa. To counteract such corruption, engineers utilize data encoders and decoders. The diagram in Figure 1 shows a typical digital data communication system.
[Figure 1: Data Communication Model. Diagram labels: Sending Coded Message, Binary Message, Encoder, Noise, Channel, Binary Message, Decoder, Receiving Coded Message.]
The most basic method of error detection utilizes a single-parity-check-bit scheme. An encoder simply attaches an extra binary bit, called a parity check bit, to the end of the message. The parity check bit can be computed as the binary sum of all message bits (even parity check). When the decoder receives a string, it computes the binary sum of all bits, including the parity check bit. If there are an odd number of errors, the sum will be 1; if there are no errors, or an even number of errors, the sum will be 0. The weakness of this method is that only an odd number of errors can be detected, and the location of the error remains unknown. This means we cannot fix the errors, and instead the decoder must request a re-send of the data. In spite of the obvious limitations, the single parity check bit scheme is often used in low speed, non-critical applications. When a higher degree of accuracy is needed, engineers turn to Hamming Code Theory to find and correct errors. The Hamming Code was devised by Richard Hamming in the late 1940’s and is currently the predominant error detection and correction method in use. We propose to explain the process by which Hamming Code error correction finds and corrects a single transmission error. It will be seen that vector spaces provide the natural language in which to explain the process.
The Hamming Code utilizes the idea of multiple parity check bits. The sole purpose of the parity check bits is to detect the location of transmission errors in the message. Since we are encoding data in a binary fashion, any corrupted bit is simply corrected by replacing the value with its complement (i.e. 0’s are replaced with 1’s and vice versa). A major concern in the design of the encoding scheme is that the parity check bits will provide the
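The single-parity-check-bit scheme described in the Introduction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names add_parity_bit and parity_check and the sample message are hypothetical and do not come from the article.

```python
def add_parity_bit(message_bits):
    """Append an even-parity check bit: the mod-2 sum of the message bits."""
    parity = sum(message_bits) % 2
    return message_bits + [parity]


def parity_check(received_bits):
    """Return the mod-2 sum of all received bits (0 means no error detected)."""
    return sum(received_bits) % 2


# Hypothetical sample message, chosen only for illustration.
sent = add_parity_bit([0, 1, 1, 1])   # the appended parity bit is 1

one_error = sent.copy()
one_error[2] ^= 1                     # flip a single bit

two_errors = one_error.copy()
two_errors[0] ^= 1                    # flip a second bit

print(parity_check(sent))             # 0: the string is consistent
print(parity_check(one_error))        # 1: an odd number of errors is detected
print(parity_check(two_errors))       # 0: an even number of errors goes unnoticed
```

The last two lines show both limitations noted in the text: a sum of 1 reveals that something went wrong but not where, and two flipped bits cancel each other in the sum.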
For any elements x, y ∈ ∧_2^n, the Hamming distance between x and y, denoted d(x, y), is the number of components where x_i ≠ y_i, for i = 1,...,n.
In our data transmission example, notice that d(x, y) ≥ 3 for any pair (x, y) of distinct code words in the code space C_{7,4}. This means that any two code words differ by at least 3 bits. In other words, our 16 code words are all at least a Hamming distance of 3 units apart. This also means that when exactly one bit has been mis-transmitted, the corrupted string x_p can be corrected to the closest code word with no ambiguity, since there is only one correct code word x_c with d(x_p, x_c) = 1. When two or more errors occur in transmission, we may possibly detect the existence of errors but cannot correct them, because when d(x_p, x_c) ≥ 2, there is a strong possibility that the corrupted string is actually closer to an incorrect code word. In this case, an incorrect code word will mistakenly be assigned to correct x_p. Note that the vector space C_{7,4}, along with the Hamming distance metric defined on it, gives us a metric space in which to work.
Since C_{7,4} is a vector space, without too much difficulty, we can compute a basis for C_{7,4} (i.e. N.S.(H)) and place it in row vector form in a matrix, G, as follows:

\[
G = \begin{bmatrix}
1 & 0 & 0 & 0 & 1 & 1 & 0 \\
0 & 1 & 0 & 0 & 1 & 0 & 1 \\
0 & 0 & 1 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 1 & 1 & 1 & 1
\end{bmatrix}
\]

This means R.S.(G) = N.S.(H) (i.e. the Row Space of G equals the Null Space of H). Since rank(H) = 3, the dimension of N.S.(H) = 7 – 3 = 4 = rank(G). The code space C_{7,4} generated by G is thus a 4-dimensional subspace in the 7-dimensional vector space ∧_2^7. We call G the generator matrix of the code space C_{7,4}. Since the set of row vectors, {g_1, g_2, g_3, g_4}, of G forms a basis for the code space C_{7,4}, any code word in C_{7,4} can be expressed as a linear combination of the g_i’s. We also see that the rows of the generator matrix G and the rows of the parity check matrix H are orthogonal to each other by observing that G·H^T = 0. Hence N.S.(G) = R.S.(H) = C.S.(H^T) (i.e. Column Space of H^T). The column space of H, C.S.(H), provides useful information regarding the position of the error. For any received string x_r, the product of the parity check matrix H and x_r is called the syndrome vector. We see that any syndrome vector Hx_r ∈ ∧_2^3 = C.S.(H) and every non-zero vector of ∧_2^3 is a column of H. Thus if x_r ∈ ∧_2^7 but x_r ∉ C_{7,4} (i.e. x_r is a string, but not a code word), then Hx_r ≠ 0. Therefore, in order for x_r to be correct, Hx_r must be in the column space of H. We call C.S.(H) the syndrome space generated by H and x_r. We will discuss the details of syndrome decoding as an alternative decoding method.
2. Code Words in Code Space
2.1 Encoding Messages
To encode a message means to convert the message from ∧_2^4 to a vector in code space C_{7,4}. We define the converting function F : ∧_2^4 → C_{7,4} as

\[
F(x_1, x_2, x_3, x_4) = F(x_1 e_1 + x_2 e_2 + x_3 e_3 + x_4 e_4) = x_1 g_1 + x_2 g_2 + x_3 g_3 + x_4 g_4,
\]

where {e_i}, i = 1,...,4, with the ith component being 1 and all others being 0, is the standard basis for ∧_2^4, and {g_1, g_2, g_3, g_4} is a basis for C_{7,4}. Since x_1g_1 + x_2g_2 + x_3g_3 + x_4g_4 = (x_1, x_2, x_3, x_4) · G, the generator matrix G is also a matrix induced by the converting function. For example, to encode (0, 1, 1, 1), function F converts (0, 1, 1, 1) = e_2 + e_3 + e_4 to g_2 + g_3 + g_4 = (0, 1, 1, 1, 0, 0, 1), which is the same as converting (0, 1, 1, 1) through G by using (0, 1, 1, 1) · G. Now the coded message has 3 attached parity check bits.
2.2 Decoding Messages
To decode a message means to check the message to determine if there has been an error, and if so, correct the error and extract the original message. The parity check matrix H will determine if the message is received correctly. Again here we assume there is at most one bit error. For example, when a message is received as

\[
x = \begin{bmatrix} 0 \\ 1 \\ 1 \\ 1 \\ 0 \\ 0 \\ 1 \end{bmatrix},
\]

the fact that

\[
Hx = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\]

indicates x ∈ N.S.(H), which means x was received correctly. Therefore, the source message is (0, 1, 1, 1), which is x without the attached parity check bits.
2.3 Error Detecting and Correcting
If a message is received as
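As a minimal sketch of the encoding step of Section 2.1, the check of Section 2.2, and the minimum-distance claim, the following Python code works with the generator matrix G given in the text. The parity check matrix H is not shown in this excerpt, so the H below is an assumption: it is one matrix whose null space is the row space of G, obtained by writing G as [I | P] and taking [P^T | I]. The article's H may differ. The function names encode, syndrome, and hamming_distance are hypothetical.

```python
from itertools import product

# Generator matrix G exactly as given in the text (rows g1..g4).
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

# ASSUMPTION: H is not shown in this excerpt. With G = [I | P], the matrix
# [P^T | I] has null space equal to the row space of G; the article's H may
# be a different matrix with the same null space.
H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]


def encode(message):
    """Compute message · G with arithmetic modulo 2."""
    return [sum(m * g for m, g in zip(message, column)) % 2 for column in zip(*G)]


def syndrome(received):
    """Compute H · received modulo 2; the zero vector means a code word."""
    return [sum(h * r for h, r in zip(row, received)) % 2 for row in H]


def hamming_distance(x, y):
    """Count the components in which x and y differ."""
    return sum(a != b for a, b in zip(x, y))


# Enumerate all code words and find the smallest distance between distinct ones.
code_words = [encode(list(m)) for m in product([0, 1], repeat=4)]
min_distance = min(
    hamming_distance(x, y)
    for i, x in enumerate(code_words)
    for y in code_words[i + 1:]
)

x = encode([0, 1, 1, 1])
print(x)                 # [0, 1, 1, 1, 0, 0, 1], the code word from Section 2.1
print(syndrome(x))       # [0, 0, 0], so x lies in N.S.(H)
print(x[:4])             # [0, 1, 1, 1], the message without the parity check bits
print(len(code_words))   # 16
print(min_distance)      # 3
```

Because the first four columns of G form an identity matrix, the first four bits of every code word are the message itself, which is why stripping the last three bits recovers the source message.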