Information theory

Information theory is a mathematical framework for understanding the transmission and processing of information, primarily through Shannon's communication model, which includes components like the message source, encoder, channel, noise, decoder, and message receiver. It distinguishes between discrete and continuous signals and examines communication in noisy and noiseless environments, emphasizing the importance of error correction. Applications of information theory include data compression, error detection, cryptology, and the quantification of information leakage in security analysis.


Information theory is a mathematical representation of the conditions and parameters affecting the transmission and processing of information.

Shannon’s communication model

Shannon’s theory deals primarily with the encoder, channel, noise source, and decoder. The focus of the theory is on signals and how they can be transmitted accurately and efficiently.

Shannon developed a very simple, abstract model of communication. Because his model is abstract, it applies in many situations, which contributes to its broad scope and power.

The first component of the model, the message source, is simply the entity that
originally creates the message. Often the message source is a human, but in
Shannon’s model it could also be an animal, a computer, or some other inanimate
object. The encoder is the object that connects the message to the actual physical
signals that are being sent.

For example, there are several ways to apply this model to two people having a
telephone conversation. On one level, the actual speech produced by one person
can be considered the message, and the telephone mouthpiece and its associated
electronics can be considered the encoder, which converts the speech into electrical
signals that travel along the telephone network.

The channel is the medium that carries the message. The channel might be wires,
the air or space in the case of radio and television transmissions, or fibre-optic
cable. In the case of a signal produced simply by banging on the plumbing, the
channel might be the pipe that receives the blow.
Noise is anything that interferes with the transmission of a signal. In telephone
conversations interference might be caused by static in the line, cross talk from
another line, or background sounds. Signals transmitted optically through the air
might suffer interference from clouds or excessive humidity. Clearly, sources of
noise depend upon the particular communication system. A single system may
have several sources of noise, but, if all of these separate sources are understood, it
will sometimes be possible to treat them as a single source.

The decoder is the object that converts the signal, as received, into a form that the
message receiver can comprehend. In the case of the telephone, the decoder could
be the earpiece and its electronic circuits. Depending upon perspective, the decoder
could also include the listener’s entire hearing system.

The message receiver is the object that gets the message. It could be a person, an animal, a computer, or some other inanimate object.

Four types of communication

There are two fundamentally different ways to transmit messages:

discrete signals

continuous signals.

Discrete signals can represent only a finite number of different, recognizable states.
For example, the letters of the English alphabet are commonly thought of as
discrete signals.

Continuous signals, also known as analog signals, are commonly used to transmit
quantities that can vary over an infinite set of values—sound is a typical example.
However, such continuous quantities can be approximated by discrete signals—for
instance, on a digital compact disc or through a digital telecommunication system
—by increasing the number of distinct discrete values available until any
inaccuracy in the description falls below the level of perception or interest.
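
A minimal sketch of this idea in Python (the sample rate, tone frequency, and 8-bit depth below are illustrative assumptions, not values from the text): a continuous sine tone is approximated by sampling it at discrete times and rounding each sample to one of a finite number of levels, as on a compact disc.

import math

def quantize(value, levels):
    # Map a value in [-1, 1] to the nearest of `levels` evenly spaced steps.
    step = 2.0 / (levels - 1)
    return round((value + 1.0) / step) * step - 1.0

sample_rate = 8000          # samples per second (illustrative)
levels = 2 ** 8             # 8-bit depth gives 256 distinct values
duration = 0.001            # one millisecond of a 440 Hz tone

samples = [math.sin(2 * math.pi * 440 * n / sample_rate)
           for n in range(int(sample_rate * duration))]
digital = [quantize(s, levels) for s in samples]

# Increasing `levels` shrinks the worst-case quantization error.
print(max(abs(a - b) for a, b in zip(samples, digital)))

Increasing the number of levels (or the sample rate) pushes the approximation error below the level of perception or interest, as described above.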

Communication can also take place in the presence or absence of noise. These
conditions are referred to as noisy or noiseless communication, respectively.
There are four cases to consider:

discrete, noiseless communication;
discrete, noisy communication;
continuous, noiseless communication; and
continuous, noisy communication.
It is easier to analyze the discrete cases than the continuous cases; likewise, the noiseless cases are simpler than the noisy cases. Therefore, the discrete, noiseless case is considered first.

Discrete, noiseless communication

From message alphabet to signal alphabet


The English alphabet is a discrete communication system. It consists of a
finite set of characters, such as uppercase and lowercase letters, digits, and various
punctuation marks. Messages are composed by stringing these individual
characters together appropriately.

For noiseless communication, the decoder at the receiving end receives exactly the characters sent by the encoder. However, these transmitted characters are typically not in the original message’s alphabet: the encoder first translates the message into a signal alphabet, such as the dots and dashes of Morse code or the 0s and 1s of a binary code.
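
As a hedged illustration (the 5-bit code below is an invented toy scheme, not one described in the text), an encoder might map each character of the message alphabet to a fixed-length string over the signal alphabet {0, 1}, and the noiseless decoder simply inverts that map:

# Toy fixed-length code: each character maps to a 5-bit string (an invented
# scheme for illustration).
ALPHABET = "abcdefghijklmnopqrstuvwxyz "
ENCODE = {ch: format(i, "05b") for i, ch in enumerate(ALPHABET)}
DECODE = {bits: ch for ch, bits in ENCODE.items()}

def encode(message):
    return "".join(ENCODE[ch] for ch in message)

def decode(signal):
    return "".join(DECODE[signal[i:i + 5]] for i in range(0, len(signal), 5))

signal = encode("hello world")
assert decode(signal) == "hello world"   # noiseless channel: received = sent
print(signal)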

Discrete, noisy communication

In the real world, however, transmission errors are unavoidable, especially given the presence in any communication channel of noise, which is the sum total of random signals that interfere with the communication signal. In order to take the inevitable transmission errors of the real world into account, some adjustment in encoding schemes is necessary.

A simple model of transmission in the presence of noise is the binary symmetric channel. Binary indicates that this channel transmits only two distinct characters, generally interpreted as 0 and 1, while symmetric indicates that errors are equally probable regardless of which character is transmitted. The probability that a character is transmitted without error is labeled p; hence, the probability of error is 1 − p.
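
A minimal simulation of the binary symmetric channel in Python, assuming only that each bit is flipped independently with probability 1 − p (the message length and the value p = 0.9 are illustrative):

import random

def binary_symmetric_channel(bits, p):
    # Each bit is delivered intact with probability p and flipped with
    # probability 1 - p, independently of every other bit.
    return [b if random.random() < p else 1 - b for b in bits]

p = 0.9                                   # illustrative value
sent = [random.randint(0, 1) for _ in range(44)]
received = binary_symmetric_channel(sent, p)
errors = [int(s != r) for s, r in zip(sent, received)]

print("Signal:", "".join(map(str, received)))
print("Errors:", "".join(map(str, errors)))

This produces printouts of exactly the Signal/Errors form discussed next.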

Consider what happens as zeros and ones, hereafter referred to as bits, emerge
from the receiving end of the channel. Ideally, there would be a means of
determining which bits were received correctly. In that case, it is possible to
imagine two printouts:

10110101010010011001010011101101000010100101—Signal

00000000000100000000100000000010000000011001—Errors

Signal is the message as received, while each 1 in Errors indicates a mistake in the
corresponding Signal bit. (Errors itself is assumed to be error-free.)

Shannon showed that the best method for transmitting error corrections requires an
average length of
E = p log2(1/p) + (1 − p) log2(1/(1 − p))
bits per error correction symbol. Thus, for every bit transmitted at least E bits have
to be reserved for error corrections.
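
The expression for E is the binary entropy of the error probability. A short check in Python (the value p = 0.99 is an arbitrary example, not a figure from the text):

import math

def error_correction_bits(p):
    # E = p*log2(1/p) + (1 - p)*log2(1/(1 - p)), the binary entropy of the
    # channel's error behaviour.
    if p in (0.0, 1.0):
        return 0.0
    return p * math.log2(1 / p) + (1 - p) * math.log2(1 / (1 - p))

# With a 99% chance of correct transmission, roughly 0.08 extra bits per
# transmitted bit must be reserved for error correction.
print(error_correction_bits(0.99))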

Applications of information theory


 Data compression
 Error-correcting and error-detecting codes
 Cryptology : Cryptology is the science of secure communication. It
concerns both cryptanalysis, the study of how encrypted information
is revealed (or decrypted) when the secret “key” is unknown, and
cryptography, the study of how information is concealed and
encrypted in the first place.

Random variables and probability distributions

A random variable is a numerical description of the
outcome of a statistical experiment. A random variable
that may assume only a finite number or
an infinite sequence of values is said to be discrete; one
that may assume any value in some interval on the real
number line is said to be continuous. For instance, a
random variable representing the number of automobiles
sold at a particular dealership on one day would be
discrete, while a random variable representing the
weight of a person in kilograms (or pounds) would be
continuous.
The probability distribution for a random variable
describes how the probabilities are distributed over the
values of the random variable.
For a discrete random variable, x, the probability
distribution is defined by a probability mass
function, denoted by f(x). This function provides the
probability for each value of the random variable. In the
development of the probability function for a discrete
random variable, two conditions must be satisfied:
(1) f(x) must be nonnegative for each value of the
random variable, and (2) the sum of the probabilities for
each value of the random variable must equal one.
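A quick sketch of those two conditions in Python, using a made-up distribution for the number of automobiles sold in a day (the probabilities are invented for illustration):

# Hypothetical probability mass function for the number of automobiles sold
# at one dealership in a day.
f = {0: 0.18, 1: 0.39, 2: 0.24, 3: 0.14, 4: 0.05}

# Condition (1): f(x) is nonnegative for every value of the random variable.
assert all(p >= 0 for p in f.values())
# Condition (2): the probabilities sum to one.
assert abs(sum(f.values()) - 1.0) < 1e-9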
A continuous random variable may assume any value in
an interval on the real number line or in a collection of
intervals. Since there are an infinite number of values in
any interval, it is not meaningful to talk about the
probability that the random variable will take on a
specific value; instead, the probability that a continuous
random variable will lie within a given interval is
considered.
The expected value, or mean, of a random variable—
denoted by E(x) or μ—is a weighted average of the values
the random variable may assume. In the discrete case
the weights are given by the probability mass function,
and in the continuous case the weights are given by the
probability density function. The formulas for computing
the expected values of discrete and continuous random
variables are given by equations 2 and 3, respectively.

E(x) = Σxf(x) (2)

E(x) = ∫xf(x)dx (3)
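A hedged sketch of both formulas in Python, reusing the illustrative car-sales distribution from above for equation (2) and an assumed uniform density on [0, 1] for equation (3):

# Equation (2), discrete case: E(x) = sum of x * f(x).
f = {0: 0.18, 1: 0.39, 2: 0.24, 3: 0.14, 4: 0.05}
print(sum(x * p for x, p in f.items()))   # 1.49 cars expected per day

# Equation (3), continuous case: E(x) = integral of x * f(x) dx, approximated
# numerically for a uniform density f(x) = 1 on [0, 1] (an assumed example).
n = 100_000
dx = 1.0 / n
print(sum((i + 0.5) * dx * 1.0 * dx for i in range(n)))   # close to 0.5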

Entropy is a measure of a random variable’s uncertainty, or the amount of information required to describe the variable; it is the average information content of an observation. The higher the entropy, the greater the uncertainty and the more information the variable contributes.
The information entropy H of a discrete random variable X can be written as:
H(X) = −Σi P(xi) log2 P(xi)
Where,
X – discrete random variable
P(xi) – probability mass function of X
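A brief sketch of this formula in Python, applied to the same illustrative car-sales distribution used earlier:

import math

def entropy(pmf):
    # H(X) = -sum over x of P(x) * log2(P(x)), skipping zero-probability values.
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)

f = {0: 0.18, 1: 0.39, 2: 0.24, 3: 0.14, 4: 0.05}
print(entropy(f))                    # about 2.08 bits
print(entropy({0: 0.5, 1: 0.5}))     # a fair coin: exactly 1 bit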
Quantification of information leakage
Quantification of information leakage is a recent
technique in security analysis that evaluates the amount
of information about a secret (for instance about a
password) that can be inferred by observing a system.
Leakage = difference in the uncertainty about the secret
h before and after observations O on the system:
H(h) − H(h|O) = I(h; O) (the mutual information between h and O)
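A small sketch of this calculation in Python, for an invented system in which an observer learns only whether a uniformly random 4-bit secret exceeds 11 (the scenario and numbers are assumptions for illustration):

import math
from collections import Counter

def uniform_entropy(k):
    # Entropy of a uniform distribution over k equally likely values.
    return math.log2(k)

# Secret: a uniformly random 4-bit value h in 0..15, so H(h) = 4 bits.
secrets = range(16)
prior = uniform_entropy(16)

# Observation: the attacker learns only whether h > 11 (an invented observable).
def observe(h):
    return h > 11

# H(h | O): given each observation, h is uniform over the consistent secrets,
# so average those entropies weighted by how often each observation occurs.
counts = Counter(observe(h) for h in secrets)
posterior = sum((c / 16) * uniform_entropy(c) for c in counts.values())

leakage = prior - posterior          # equals the mutual information I(h; O)
print(prior, posterior, leakage)     # 4.0, about 3.19, about 0.81 bits leaked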
Lower Bound on Key Size:
key size or key length refers to the number of bits in
a key used by a cryptographic algorithm.
Key length defines the upper-bound on an
algorithm's security (i.e. a logarithmic measure of the
fastest known attack against an algorithm), because the
security of all algorithms can be violated by brute-force
attacks. Ideally, the lower-bound on an algorithm's
security is by design equal to the key length (that is, the
algorithm's design does not detract from the degree of
security inherent in the key length).
Most symmetric-key algorithms are designed to have
security equal to their key length.
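As a rough illustration (the attack rate of 10**12 guesses per second is an assumption, not a figure from the text), the brute-force work implied by an n-bit key is 2**n trials, which is why every extra bit of key length doubles the attacker's cost:

# Search space implied by an n-bit key, assuming the only attack is brute force
# at a hypothetical rate of 10**12 guesses per second.
GUESSES_PER_SECOND = 10 ** 12
SECONDS_PER_YEAR = 31_557_600

for n in (56, 80, 128, 256):
    trials = 2 ** n
    years = trials / GUESSES_PER_SECOND / SECONDS_PER_YEAR
    print(f"{n}-bit key: {trials:.2e} keys, about {years:.2e} years to exhaust")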
Secret Authentication and Secret Sharing:
Secret authentication information is a gateway to access
valuable assets. It typically includes passwords,
encryption keys, etc., so it needs to be controlled through
a formal management process and kept confidential to the
user.
Secret sharing (also called secret splitting) refers to
methods for distributing a secret among a group, in such
a way that no individual holds any intelligible
information about the secret, but when a sufficient
number of individuals combine their 'shares', the secret
may be reconstructed. Whereas insecure secret sharing
allows an attacker to gain more information with each
share, secure secret sharing is 'all or nothing' (where
'all' means the necessary number of shares).
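A minimal sketch of this "all or nothing" property, using a simple n-of-n XOR split (this toy scheme is only an illustration; threshold schemes such as Shamir's secret sharing are more general):

import secrets as rng
from functools import reduce

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def split_secret(secret, n):
    # n-of-n XOR sharing: n - 1 random shares plus one final share chosen so
    # that all n shares XOR back to the secret. Any smaller subset is just
    # uniformly random bytes and reveals nothing about the secret.
    shares = [rng.token_bytes(len(secret)) for _ in range(n - 1)]
    shares.append(reduce(xor_bytes, shares, secret))
    return shares

def reconstruct(shares):
    return reduce(xor_bytes, shares)

shares = split_secret(b"launch code", 3)
print(reconstruct(shares))        # b'launch code'
print(reconstruct(shares[:2]))    # random-looking bytes: two shares reveal nothing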

Computationally Secure Cipher:

A "computationally secure cipher" is a type of


encryption technique in which the cost of cracking the
secure cipher is greater than the value of the
information being encrypted, and the time required to
crack the cipher is longer than the useful lifetime of the
information.
