
ISITA2016, Monterey, California, USA, October 30-November 2, 2016

Characterising Probability Distributions via Entropies

Satyajit Thakor†, Terence Chan‡ and Alex Grant∗
Indian Institute of Technology Mandi†
University of South Australia‡
Myriota Pty Ltd∗

Abstract—Characterising the capacity region for a network can be extremely difficult, especially when the sources are dependent. Most existing computable outer bounds are relaxations of the Linear Programming bound. One main challenge in extending linear programming bounds to the case of correlated sources is the difficulty (or impossibility) of characterising arbitrary dependencies via entropy functions. This paper tackles the problem by addressing how to use entropy functions to characterise correlation among sources.

I. INTRODUCTION

This paper begins with a very simple and well-known result. Consider a binary random variable X such that

pX(0) = p and pX(1) = 1 − p.

While the entropy of X does not determine exactly what the probabilities of X are, it essentially determines the probability distribution (up to relabelling). To be precise, let 0 ≤ q ≤ 1/2 be such that H(X) = hb(q), where

hb(q) ≜ −q log q − (1 − q) log(1 − q).

Then either p = q or p = 1 − q. Furthermore, the two possible distributions can be obtained from each other by renaming the random variable outcomes appropriately. In other words, there is a one-to-one correspondence between entropies and distributions (when the random variable is binary).

The basic question now is: how "accurately" can entropies specify the distribution of random variables? When X is not binary, the entropy H(X) alone is not sufficient to characterise the probability distribution of X. In [1], it was proved that if X is a random scalar variable, its distribution can still be determined by using auxiliary random variables subject to an alphabet cardinality constraint. The result can also be extended to random vectors if the distribution is positive. However, the proposed approach cannot be generalised to the case when the distribution is not positive.

Main contributions: In this paper, we take a different approach and generalise the result to arbitrary random vectors. Before continuing to answer the question, we briefly describe an application (based on network coding problems) of characterising distributions (and correlations) among random variables by using entropies.
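To make the opening observation concrete, here is a minimal Python sketch (our illustration, not part of the paper; the helper names hb and hb_inverse are ours) that recovers the two candidate distributions {q, 1 − q} from the value of H(X) alone by numerically inverting the binary entropy function on [0, 1/2].

```python
import math

def hb(q):
    """Binary entropy in bits, with hb(0) = hb(1) = 0."""
    if q in (0.0, 1.0):
        return 0.0
    return -q * math.log2(q) - (1 - q) * math.log2(1 - q)

def hb_inverse(h, tol=1e-12):
    """The unique q in [0, 1/2] with hb(q) = h, found by bisection."""
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if hb(mid) < h:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

p = 0.8                    # true distribution: pX(0) = 0.8, pX(1) = 0.2
q = hb_inverse(hb(p))      # recovered from H(X) alone
print(q, 1 - q)            # ~0.2 and ~0.8: the distribution, up to relabelling
```

Bisection works here because hb is strictly increasing on [0, 1/2], which is exactly the monotonicity the paper appeals to.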


Let the directed acyclic graph G = (V, E) serve as a simplified model of a communication network with error-free point-to-point communication links. Edges e ∈ E have finite capacity Ce > 0. Let S be an index set for a number of multicast sessions, and {Ys : s ∈ S} be the set of source random variables. These sources are available at the nodes identified by the mapping a : S → 2^V (a source may be available at multiple nodes). Similarly, each source may be demanded by multiple sink nodes, identified by the mapping b : S → 2^V. For all s, assume that a(s) ∩ b(s) = ∅. Each edge e ∈ E carries a random variable Ue which is a function of incident edge random variables and source random variables. Sources are i.i.d. sequences {(Ys^n, s ∈ S), n = 1, 2, . . .}. Hence, each (Ys^n, s ∈ S) has the same joint distribution and is independent across different n. For notational simplicity, we will use (Ys, s ∈ S) to denote a generic copy of the sources at any particular time instance. However, within the same "time" instance n, the random variables (Ys^n, s ∈ S) may be correlated. We assume that the distribution of (Ys, s ∈ S) is known.

Roughly speaking, a link capacity tuple C = (Ce : e ∈ E) is achievable if one can design a network coding solution to transmit the sources {(Ys^n, s ∈ S), n = 1, 2, . . .} to their respective destinations such that 1) the probability of decoding error vanishes (as n goes to infinity), and 2) the number of bits transmitted on the link e ∈ E is at most nCe. The set of all achievable link capacity tuples is denoted by R.

Theorem 1 (Outer bound [2]): For a given network, consider the set of correlated sources (Ys, s ∈ S) with underlying probability distribution PYS(·). Construct any auxiliary random variables (Ki, i ∈ L) by choosing a conditional probability distribution function PKL|YS(·). Let R′ be the set of all link capacity tuples C = (Ce : e ∈ E) such that there exists a polymatroid h satisfying the following constraints

h(XW, JZ) − H(YW, KZ) = 0    (1)
h(Ue | Xs : a(s) → e, Uf : f → e) = 0    (2)
h(Xs : u ∈ b(s) | Xs′ : u ∈ a(s′), Ue : e → u) = 0    (3)
Ce − h(Ue) ≥ 0    (4)

for all W ⊆ S, Z ⊆ L, e ∈ E, u ∈ b(s) and s ∈ S. Then

R ⊆ R′    (5)

where the notation x → y means x is incident to y, and x, y can be an edge or a node.

Remark 1: The region R′ will depend on how we choose the auxiliary random variables (Ki, i ∈ L). In the following, we give an example to illustrate this fact.

Consider the network coding problem depicted in Figure 1, in which three correlated sources Y1, Y2, Y3 are available at node 1 and are demanded at nodes 3, 4, 5 respectively. Here, Y1, Y2, Y3 are defined such that Y1 = (b0, b1), Y2 = (b0, b2) and Y3 = (b1, b2) for some independent and uniformly distributed binary random variables b0, b1, b2. Furthermore, the edges from node 2 to nodes 3, 4, 5 have sufficient capacity to carry the random variable U1 available at node 2. We consider two outer bounds obtained from Theorem 1 for this network coding problem. In the first scenario, we use no auxiliary random variables, while in the second scenario, we use three auxiliary random variables such that

K0 = b0, K1 = b1, K2 = b2.

Fig. 1. A network example [2].

Let R′1 and R′2 be respectively the outer bounds for the two scenarios. Then R′2 is a proper subset of R′1. In particular, the link capacity tuple (Ce = 1, e = 1, . . . , 4) is in the region R′1 \ R′2 [2]. This example shows that by properly choosing auxiliary random variables, one can better capture the correlations among the sources, leading to a strictly tighter outer bound for network coding. Construction of auxiliary random variables from source correlation was also considered in [3] to improve cut-set bounds.

II. MAIN RESULTS

In this section, we will show that by using auxiliary random variables, the probability distribution of a set of random variables (or a random vector) can be uniquely characterised from the entropies of these variables.

A. Random Scalar Case

Consider any ternary random variable X. Clearly, the entropy of X and its probability distribution are not in one-to-one correspondence. In [1], auxiliary random variables are used in order to exactly characterise the distribution.

Suppose X is ternary, taking values from the set {1, 2, 3}. Suppose also that pX(x) > 0 for all x ∈ {1, 2, 3}. Define random variables A1, A2 and A3 such that

Ai = 1 if X = i, and Ai = 0 otherwise.    (6)

Clearly,

H(Ai | X) = 0,    (7)
H(Ai) = hb(pX(i)).    (8)

Let us further assume that pX(i) ≤ 1/2 for all i. Then, by (8) and the strict monotonicity of hb(q) in the interval [0, 1/2], it seems at first glance that the distribution of X is uniquely specified by the entropies of the auxiliary random variables.

However, there is a catch in the argument: the auxiliary random variables chosen are not arbitrary. When we "compute" the probabilities of X from the entropies of the auxiliary random variables, it is assumed that we know how the random variables are constructed. Without knowing the "construction", it is unclear how to find the probabilities of X from entropies. More precisely, suppose we only know that there exist auxiliary random variables A1, A2, A3 such that (7) and (8) hold (without knowing that the random variables are specified by (6)). Then we cannot determine precisely what the distribution of X is. Despite this complication, [1], [2] showed a construction of auxiliary random variables from which the probability distribution can be characterised from entropies (see [4] for detailed proofs). The results will also be briefly restated here as a necessary prerequisite for the vector case.

Let X be a random variable with support Nn = {1, . . . , n} and Ω be the set of all nonempty binary partitions of Nn. In other words, Ω is the collection of all sets {α, αc} such that α ⊆ Nn, and both |α| and |αc| are nonzero. We will use α to denote the set {α, αc}. To simplify notation, we may assume without loss of generality that α is a subset of {2, . . . , n}. Clearly, |Ω| = 2^(n−1) − 1. Unless explicitly stated otherwise, we may assume without loss of generality that the probability that X = i (denoted by pi) is monotonically decreasing. In other words,

p1 ≥ . . . ≥ pn > 0.

Definition 1 (Partition Random Variables): A random variable X with support Nn induces 2^(n−1) − 1 random variables Aα for α ∈ Ω such that

Aα ≜ α if X ∈ α, and αc otherwise.    (9)

We call {Aα, α ∈ Ω} the collection of binary partition random variables of X.

Remark 2: If |α| = 1 or n − 1, then there exists an element i ∈ Nn such that Aα = {i} if and only if X = i. Hence, Aα is essentially a binary variable indicating/detecting whether X = i or not. As such, we call Aα an indicator variable. Furthermore, when n ≥ 3, there are exactly n indicator variables, one for each element in Nn.
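As a small numerical companion to Definition 1 and equation (8), the following Python sketch (our own illustration; the helper names are not from the paper) enumerates the 2^(n−1) − 1 binary partition random variables of a scalar X and checks that each indicator variable has entropy hb(pX(i)).

```python
import math
from itertools import combinations

def entropy(dist):
    """Shannon entropy (bits) of a dict mapping outcomes to probabilities."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def hb(q):
    """Binary entropy function."""
    return entropy({0: q, 1: 1 - q})

def binary_partitions(support):
    """One representative subset alpha per partition {alpha, alpha^c}; following
    the paper's convention, alpha never contains the first element."""
    rest = support[1:]
    return [frozenset(c) for r in range(1, len(support))
            for c in combinations(rest, r)]

def partition_var_dist(px, alpha):
    """Distribution of A_alpha, which reports whether X lies in alpha or not."""
    p_in = sum(p for x, p in px.items() if x in alpha)
    return {"alpha": p_in, "alpha_c": 1 - p_in}

px = {1: 0.5, 2: 0.3, 3: 0.2}                  # a ternary example
parts = binary_partitions(sorted(px))          # {2}, {3}, {2,3}: 2^(3-1) - 1 = 3
for alpha in parts:
    print(sorted(alpha), round(entropy(partition_var_dist(px, alpha)), 4))

# The partition {2,3} is the indicator for X = 1, so its entropy is hb(p_1):
assert abs(entropy(partition_var_dist(px, frozenset({2, 3}))) - hb(px[1])) < 1e-12
```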


Theorem 2 (Random Scalar Case): Suppose X is a random variable with support Nn. For any α ∈ Ω, let Aα be the corresponding binary partition random variable. Now, suppose X∗ is another random variable such that 1) the size of the support of X∗ is at most the same as that of X, and 2) there exist random variables (Bα, α ∈ Ω) satisfying the following conditions:

H(Bα, α ∈ Δ) = H(Aα, α ∈ Δ)    (10)
H(Bα | X∗) = 0    (11)

for all Δ ⊆ Ω. Then there is a mapping

σ : Nn → X∗

such that Pr(X = i) = Pr(X∗ = σ(i)). In other words, the probability distributions of X and X∗ are essentially the same (via renaming outcomes).

Proof: A sketch of the proof is shown in Appendix A.

B. Random Vector Case

Extension of Theorem 2 to the case of random vectors has also been considered briefly in our previous work [1]. However, the extension is fairly limited in that work: the random vector must have a positive probability distribution and each individual random variable must take at least three possible values. In this paper, we overcome these restrictions and fully generalise Theorem 2 to the random vector case.

Example 1: Consider two random vectors X = (X1, X2) and X∗ = (X1∗, X2∗) with probability distributions given in Table I. If we compare the joint probability distributions of X and X∗, they are different from each other. Yet, if we treat X and X∗ as scalars (by properly renaming), then they indeed have the same distribution (both uniformly distributed over a support of size 8). This example shows that we cannot directly apply Theorem 2 to the random vector case by simply mapping a vector into a scalar.

TABLE I
PROBABILITY DISTRIBUTIONS OF X AND X∗

X = (X1, X2):         X2 = 1   X2 = 2   X2 = 3   X2 = 4
       X1 = a           1/8      1/8      0        0
       X1 = b           1/8      1/8      0        0
       X1 = c            0        0      1/8      1/8
       X1 = d            0        0      1/8      1/8

X∗ = (X1∗, X2∗):      X2∗ = 1  X2∗ = 2  X2∗ = 3  X2∗ = 4
       X1∗ = a          1/8      1/8      0        0
       X1∗ = b           0       1/8     1/8       0
       X1∗ = c           0        0      1/8      1/8
       X1∗ = d          1/8       0       0       1/8

Theorem 3 (Random Vector): Suppose X = (X1, . . . , XM) is a random vector with support X of size at least 3. Again, let Ω be the set of all nonempty binary partitions of X and Aα be the binary partition random variable of X such that

Aα = α if X ∈ α, and αc otherwise    (12)

for all α ∈ Ω.

Now, suppose X∗ = (X1∗, . . . , XM∗) is another random vector for which there exist random variables (Bα, α ∈ Ω) such that for any subset Δ of Ω and τ ⊆ {1, . . . , M},

H(Bα, α ∈ Δ, Xj∗, j ∈ τ) = H(Aα, α ∈ Δ, Xj, j ∈ τ).    (13)

Then the joint probability distributions of X = (X1, . . . , XM) and X∗ = (X1∗, . . . , XM∗) are essentially the same. More precisely, there exist bijective mappings σm for m = 1, . . . , M such that

Pr(X = (x1, . . . , xM)) = Pr(X∗ = (σ1(x1), . . . , σM(xM))).    (14)

Proof: See Appendix B.
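As a quick numerical check of Example 1, the following Python sketch (ours, not from the paper) confirms from Table I that the two joint pmfs differ, that they coincide when flattened to scalars, and that even their marginal and joint entropies agree.

```python
import math
from collections import Counter

# Table I: joint pmfs of X = (X1, X2) and X* = (X1*, X2*) on {a,b,c,d} x {1,2,3,4}
pX  = {(r, c): 1/8 for r, c in [('a',1),('a',2),('b',1),('b',2),
                                ('c',3),('c',4),('d',3),('d',4)]}
pXs = {(r, c): 1/8 for r, c in [('a',1),('a',2),('b',2),('b',3),
                                ('c',3),('c',4),('d',1),('d',4)]}

def H(dist):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def marginal(dist, axis):
    out = Counter()
    for outcome, p in dist.items():
        out[outcome[axis]] += p
    return out

print(pX == pXs)                                      # False: the joint pmfs differ
print(Counter(pX.values()) == Counter(pXs.values()))  # True: as scalars, both uniform on 8 outcomes
print([round(H(d), 3) for d in (pX, pXs,
                                marginal(pX, 0), marginal(pXs, 0),
                                marginal(pX, 1), marginal(pXs, 1))])
# joint and marginal entropies all coincide, yet the vectors differ (cf. Example 1)
```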


C. Application: Network coding outer bound

Together with Theorem 1 and the characterisation of distributions using entropies, we obtain the following outer bound on R.

Corollary 1: For any given network, consider the set of correlated sources (Ys, s ∈ S) with underlying probability distribution PYS(·). From this distribution, construct binary partition random variables Aα as described in Theorem 3 (for the vector case). Let R′(Γ∗) be the set of all link capacity tuples C = (Ce : e ∈ E) such that there exists a polymatroid function h satisfying the constraints (2)-(4) and

h(Xs, s ∈ W, Bα, α ∈ Δ) = H(Ys, s ∈ W, Aα, α ∈ Δ)    (15)

for all W ⊆ S, Δ ⊆ Ω, e ∈ E, u ∈ b(s) and s ∈ S. Then R ⊆ R′(Γ∗).

III. CONCLUSION

In this paper, we showed that by using auxiliary random variables, entropies are sufficient to uniquely characterise the probability distribution of a random vector (up to outcome relabelling). Yet, there are still many open questions that remain to be answered. For example, the number of auxiliary random variables used is exponential in the size of the support. Can we reduce the number of auxiliary random variables? What is the tradeoff between the number of auxiliary variables used and how well entropies can characterise the distribution? At the extreme, if only one auxiliary random variable can be used, how can one pick the variable to best describe the distribution?

REFERENCES

[1] S. Thakor, T. Chan, and A. Grant, "Characterising correlation via entropy functions," in Information Theory Workshop (ITW), 2013 IEEE, pp. 1–2, Sept. 2013 (invited paper).
[2] S. Thakor, T. Chan, and A. Grant, "Bounds for network information flow with correlated sources," in Australian Communications Theory Workshop (AusCTW), pp. 43–48, Feb. 2011.
[3] A. Gohari, S. Yang, and S. Jaggi, "Beyond the cut-set bound: Uncertainty computations in network coding with correlated sources," IEEE Trans. Inform. Theory, vol. 59, pp. 5708–5722, Sept. 2013.
[4] S. Thakor, T. Chan, and A. J. Grant, "On the capacity of networks with correlated sources," CoRR, vol. abs/1309.1517, 2013.

APPENDIX A - SCALAR CASE

The main ingredients in the proofs of Theorems 2 and 3 are the properties of the partition random variables, which are reviewed as follows. By understanding these properties, we can better understand the logic behind Theorem 2.

Lemma 1 (Properties): Let X be a random variable with support Nn, and (Aα, α ∈ Ω) be its induced binary partition random variables. Then the following properties hold:
1) (Distinctness) For any α ≠ β,
H(Aα | Aβ) > 0,    (16)
H(Aβ | Aα) > 0.    (17)
2) (Completeness) Let A∗ be a binary random variable such that H(A∗ | X) = 0 and H(A∗) > 0. Then there exists α ∈ Ω such that
H(A∗ | Aα) = H(Aα | A∗) = 0.    (18)
In other words, Aα and A∗ are essentially the same.
3) (Basis) Let α ∈ Ω. Then there exist β1, . . . , βn−2 ∈ Ω such that
H(Aβk | Aα, Aβ1, . . . , Aβk−1) > 0    (19)
for all k = 1, . . . , n − 2.

Among all binary partition random variables, we are particularly interested in the indicator random variables. The following proposition can be interpreted as an "entropic characterisation" of those indicator random variables.

Proposition 1 (Characterising indicators): Let X be a random variable with support Nn where n ≥ 3. Consider the binary partition random variables induced by X. Then for all i ≥ 2,
1) H(Ai | Aj, j > i) > 0, and
2) for all α ∈ Ω such that H(Aα | Aj, j > i) > 0,
H(Ai) ≤ H(Aα).    (20)
3) Equality in (20) holds if and only if Aα is an indicator random variable detecting an element ℓ ∈ Nn such that pℓ = pi.
4) Let β ⊆ {2, . . . , n}. The indicator random variable A1 is the only binary partition variable of X such that
H(Aα | Aj, j ∈ β) > 0
for all proper subsets β of {2, . . . , n}.

Sketch of Proof for Theorem 2: Let X be a random scalar and Aα for α ∈ Ω be its induced partition random variables. Suppose X∗ is another random variable such that 1) the size of the support of X∗ is at most the same as that of X, and 2) there exist random variables (Bα, α ∈ Ω) satisfying (10) and (11).

Roughly speaking, (10) and (11) mean that the set of random variables (Bα, α ∈ Ω) satisfies most properties of ordinary partition random variables. To prove the theorem, our first immediate goal is to prove that the random variables Bα are indeed binary partition random variables. In particular, we can prove the following.
1) (Distinctness) All the random variables Bα for α ∈ Ω are distinct and have non-zero entropies.
2) (Basis) Let α ∈ Ω. Then there exist β1, . . . , βn−2 ∈ Ω such that
H(Bβk | Bα, Bβ1, . . . , Bβk−1) > 0    (21)
for all k = 1, . . . , n − 2.
3) (Binary properties) For any α ∈ Ω, Bα is a binary partition random variable of X∗. In this case, we may assume without loss of generality that there exists ωα ⊆ X∗ such that
Bα = ωα if X∗ ∈ ωα, and ωαc otherwise.    (22)
4) (Completeness) Let B∗ be a binary partition random variable of X∗ with non-zero entropy. Then there exists α ∈ Ω such that
H(B∗ | Bα) = H(Bα | B∗) = 0.    (23)

Then by (10)–(11) and Proposition 1, we show that Bα satisfies all properties which are only satisfied by the indicator random variables. Thus, we prove that Bα is an indicator variable if |α| = 1. Finally, once we have determined which variables are the indicator variables, we can immediately determine the probability distribution. As H(Aα) = H(Bα) for all α ∈ Ω, the distribution of X∗ is indeed the same as that of X (subject to relabelling).
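The distinctness property in Lemma 1 is easy to check numerically. The sketch below (our own verification code, with illustrative helper names) confirms, for a small example, that no binary partition variable is a function of another.

```python
import math
from itertools import combinations
from collections import Counter

def entropy(joint):
    return -sum(p * math.log2(p) for p in joint.values() if p > 0)

def cond_entropy(px, f, g):
    """H(f(X) | g(X)) for functions f, g of a discrete X with pmf px."""
    joint, marg = Counter(), Counter()
    for x, p in px.items():
        joint[(f(x), g(x))] += p
        marg[g(x)] += p
    return entropy(joint) - entropy(marg)

def binary_partitions(support):
    """One representative alpha per binary partition {alpha, alpha^c}."""
    rest = support[1:]
    return [frozenset(c) for r in range(1, len(support))
            for c in combinations(rest, r)]

px = {1: 0.4, 2: 0.3, 3: 0.2, 4: 0.1}               # support N_4, all probabilities positive
parts = binary_partitions(sorted(px))                # 2^(4-1) - 1 = 7 partition variables
membership = lambda alpha: (lambda x: x in alpha)    # A_alpha, relabelled as a True/False variable

# Lemma 1, distinctness: distinct partition variables are never functions of one another
for a, b in combinations(parts, 2):
    assert cond_entropy(px, membership(a), membership(b)) > 1e-9
    assert cond_entropy(px, membership(b), membership(a)) > 1e-9
print("distinctness verified for", len(parts), "binary partition variables")
```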


APPENDIX B - VECTOR CASE

In this appendix, we sketch the proof of Theorem 3, which extends Theorem 2 to the random vector case.

Consider a random vector X = (Xm : m ∈ NM). We will only consider the general case¹ where the support size of X is at least 3.

¹ In the special case when the support size of X is less than 3, the theorem can be proved directly.

Let X be the support of X. Hence, elements of X are of the form x = (x1, . . . , xM) such that

Pr(Xm = xm, m ∈ NM) > 0

if and only if x ∈ X.

The collection of binary partition random variables induced by the random vector X = (Xm, m ∈ NM) is again indexed by (Aα, α ∈ Ω). As before, we may assume without loss of generality that

Aα = α if X ∈ α, and αc otherwise.    (24)

Now, suppose (Bα, α ∈ Ω) is a set of random variables satisfying the properties specified in Theorem 3. Invoking Theorem 2 (by treating the random vector X∗ as one discrete variable), we can prove the following.
1) The sizes of the supports of X∗ and X are the same.
2) Bα is a binary partition variable for all α ∈ Ω.
3) The set of variables (Bα, α ∈ Ω) contains all distinct binary partition random variables induced by X∗.
4) Bx is an indicator variable for all x ∈ X.

By definition, Ax is an indicator variable detecting x. However, while Bx is an indicator variable, the subscript x in Bx is only an index. The element detected by Bx can be any element in the support of X∗, which can be completely different from X. To highlight the difference, we define the mapping σ such that, for any x ∈ X, σ(x) is the element in the support of X∗ that is detected by Bx. In other words, A∗σ(x) = Bx. The lemma below follows from Theorem 2.

Lemma 2: For all x ∈ X,

Pr(X = x) = Pr(X∗ = σ(x)).

Let X∗ be the support of X∗. We similarly define Ω∗ as the collection of all sets of the form {γ, γc} where γ is a subset of X∗ and the sizes of γ and γc are non-zero. Again, we will use γ to denote the set {γ, γc} and define

A∗γ = γ if X∗ ∈ γ, and γc otherwise.    (25)

For any α ∈ Ω, Bα is a binary partition random variable of X∗. Hence, we may assume without loss of generality that there exists γ such that A∗γ = Bα. For notational simplicity, we may further extend² the mapping σ such that A∗σ(α) = Bα for all α ⊆ X.

² Strictly speaking, σ(α) is not precisely defined. As γ ≠ γc, σ(α) can either be γ or γc. Yet, the precise choice of σ(α) does not have any effect on the proof. We only require that when α is a singleton, σ(α) is also a singleton.

Proposition 2: Let α ∈ Ω. Suppose Aβ satisfies the following properties:
1) For any γ ⊆ α, H(Aβ | Ax, x ∈ γ) = 0 if and only if γ = α.
2) For any γ ⊆ αc, H(Aβ | Ax, x ∈ γ) = 0 if and only if γ = αc.
Then Aβ = Aα.

Proof: Direct verification.

By the definition of Bα and Proposition 2, we have the following result.

Proposition 3: Let α ∈ Ω. Then Bβ = Bα is the only binary partition variable of X∗ such that
1) For any γ ⊆ α, H(Bβ | Bx, x ∈ γ) = 0 if and only if γ = α.
2) For any γ ⊆ αc, H(Bβ | Bx, x ∈ γ) = 0 if and only if γ = αc.

Proposition 4: Let α ∈ Ω. Then σ(α) = δ(α), where δ(α) = {σ(x) : x ∈ α}.

Proof: By Proposition 3, Bα = A∗σ(α) is the only variable such that
1) For any γ ⊆ α, H(A∗σ(α) | A∗σ(x), x ∈ γ) = 0 if and only if γ = α.
2) For any γ ⊆ αc, H(A∗σ(α) | A∗σ(x), x ∈ γ) = 0 if and only if γ = αc.
The above two properties can then be rephrased as
1) For any δ(γ) ⊆ δ(α), H(A∗σ(α) | A∗σ(x), σ(x) ∈ δ(γ)) = 0 if and only if δ(γ) = δ(α).
2) For any δ(γ) ⊆ δ(αc), H(A∗σ(α) | A∗σ(x), σ(x) ∈ δ(γ)) = 0 if and only if δ(γ) = δ(αc).
Now, we can invoke Proposition 2 and prove that A∗δ(α) = A∗σ(α), or equivalently, δ(α) = σ(α). The proposition then follows.

Proposition 5: Consider two distinct elements x = (x1, . . . , xM) and x′ = (x′1, . . . , x′M) in X. Let

σ(x) = y = (y1, . . . , yM)    (26)
σ(x′) = y′ = (y′1, . . . , y′M).    (27)

Then xm ≠ x′m if and only if ym ≠ y′m.

Proof: First, we will prove the only-if statement. Suppose xm ≠ x′m. Consider the following two sets

Δ = {x′′ = (x′′1, . . . , x′′M) ∈ X : x′′m = xm},    (28)
Δc = {x′′ = (x′′1, . . . , x′′M) ∈ X : x′′m ≠ xm}.    (29)

It is obvious that H(AΔ | Xm) = 0. By (10)–(11), we have H(BΔ | Xm∗) = 0. Hence, BΔ = A∗σ(Δ). Since H(BΔ | Xm∗) = 0, this implies H(A∗σ(Δ) | Xm∗) = 0. Now, notice that x′ ∈ Δc and x ∈ Δ. By Proposition 4, σ(Δ) = {σ(x′′) : x′′ ∈ Δ}. Therefore, y′ = σ(x′) ∉ σ(Δ) and y = σ(x) ∈ σ(Δ). Together with the fact that H(A∗σ(Δ) | Xm∗) = 0, we can then prove that y′m ≠ ym.

Next, we prove the if-statement. Suppose y, y′ ∈ X∗ are such that ym ≠ y′m. There exist x and x′ such that (26) and (27) hold. Again, define

Λ ≜ {y′′ = (y′′1, . . . , y′′M) ∈ X∗ : y′′m = ym},    (30)
Λc ≜ {y′′ = (y′′1, . . . , y′′M) ∈ X∗ : y′′m ≠ ym}.    (31)

Then H(A∗Λ | Xm∗) = 0. Let Φ ≜ {x′′ ∈ X : σ(x′′) ∈ Λ}. By definition and Proposition 4, BΦ = A∗σ(Φ) = A∗Λ. Hence, we have H(BΦ | Xm∗) = 0 and consequently H(AΦ | Xm) = 0. On the other hand, it can be verified from the definition that x′ ∈ Φc and x ∈ Φ. Together with the fact that H(AΦ | Xm) = 0, we prove that xm ≠ x′m. The proposition then follows.

Proof of Theorem 3: A direct consequence of Proposition 5 is that there exist bijective mappings σ1, . . . , σM such that σ(x) = (σ1(x1), . . . , σM(xM)). On the other hand, Theorem 2 proved that Pr(X = x) = Pr(X∗ = σ(x)). Consequently,

Pr(X1 = x1, . . . , XM = xM) = Pr(X1∗ = σ1(x1), . . . , XM∗ = σM(xM)).    (32)

Therefore, the joint distributions of X = (X1, . . . , XM) and X∗ = (X1∗, . . . , XM∗) are essentially the same (by renaming xm as σm(xm)).
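To illustrate the notion of "essentially the same" in (14) and (32), here is a brute-force Python sketch (our own; the function name essentially_same is hypothetical) that searches for per-coordinate bijections σ1, σ2 between two joint pmfs. Applied to Table I it finds none, while a genuinely relabelled copy is matched immediately.

```python
from itertools import permutations

def essentially_same(p, q):
    """Search for per-coordinate bijections sigma_1, sigma_2 with
    q(sigma_1(x1), sigma_2(x2)) == p(x1, x2) for all (x1, x2), as in (14)/(32)."""
    rows, cols = sorted({a for a, _ in p}), sorted({b for _, b in p})
    rows_q, cols_q = sorted({a for a, _ in q}), sorted({b for _, b in q})
    for pr in permutations(rows_q):
        s1 = dict(zip(rows, pr))
        for pc in permutations(cols_q):
            s2 = dict(zip(cols, pc))
            if all(abs(p.get((a, b), 0.0) - q.get((s1[a], s2[b]), 0.0)) < 1e-12
                   for a in rows for b in cols):
                return s1, s2
    return None

# Table I pmfs (only nonzero entries listed)
pX  = {(r, c): 1/8 for r, c in [('a',1),('a',2),('b',1),('b',2),
                                ('c',3),('c',4),('d',3),('d',4)]}
pXs = {(r, c): 1/8 for r, c in [('a',1),('a',2),('b',2),('b',3),
                                ('c',3),('c',4),('d',1),('d',4)]}

print(essentially_same(pX, pXs))           # None: no coordinate-wise relabelling exists
relabelled = {(r.upper(), 5 - c): v for (r, c), v in pX.items()}
print(essentially_same(pX, relabelled))    # recovers a pair (sigma_1, sigma_2)
```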

