Compact representation of polymatroid axioms for random variables with conditional independencies
Abstract: The polymatroid axioms are predominantly used to study the capacity limits of various communication systems. In fact, for most of the communication systems for which the capacity is known, these axioms alone suffice to obtain the characterization of capacity. Moreover, the polymatroid axioms are stronger tools to tackle the implication problem for conditional independencies compared to the axioms used in Bayesian networks. However, their use is prohibitively complex as the number of random variables increases, since the number of inequalities to consider increases exponentially. In this paper we give a compact characterization of the minimal set of polymatroid axioms when arbitrary conditional independence and functional dependence constraints are given. In particular, we identify those elemental equalities which are implied by given constraints. We also identify those elemental inequalities which are redundant given the constraints.

I. INTRODUCTION

In [1], we considered complexity reduction of the LP bound via simplified characterization of elemental inequalities when network coding and source independence constraints are given for an instance of network coding model. We also gave novel algorithms which directly generate the simplified characterization. The results developed are also applicable for computational complexity reduction for proving information inequalities using Information Theoretic Inequality Prover [2] (ITIP) while functional dependence and independence constraints are given for a set of random variables (in general, polymatroidal pseudo-variables). The motivation was that the network coding and source independence constraints for an instance of network coding model can be exploited to construct Functional Dependence Graphs (FDGs), which in turn can be used to find irreducible sets [3] (equalities of joint entropies).

But in a general communication scenario, the random variables may have a causal dependence relationship rather than a functional dependence relationship. Such a system of random variables can be modeled as Bayesian networks (directed acyclic graphs) [4]. Moreover, if the random variables have cyclic causal dependency (e.g. feedback) then the Bayesian network modeling is no longer applicable. Such a system of random variables can be modeled either as a Markov chain or a Markov random field (undirected graph), collectively regarded as Markov structures [2]. In fact, a Markov structure is a collection of special conditional independencies called full conditional mutual independency. But the converse is not true. That is, for a given collection of conditional independencies there may not exist a Markov structure.

In the most general setting, there may be a set of random variables with functional dependence and conditional independence constraints such that it may not be possible to represent them as any graphical model (FDG, Bayesian or Markov). In this work, we give a compact representation of polymatroid axioms for this most general case. This compact formulation can have potentially many applications. For example, it can be used to find efficiently those conditional independence implications which may not be feasible to find by employing the set of axioms [6] used in Bayesian networks (see [2, Section 14.5] for details). The compact representation also enables proving basic information inequalities with arbitrary conditional independence constraints faster¹. Moreover, it can be used for faster computation of the LP bound [2] for communication scenarios involving random variables with causal dependencies and conditional independencies.

¹ Here arbitrary means any set of conditional independence constraints which may not be consistent with any graphical model.

The paper is organized as follows. In Section II, we formally describe the problem. In Section III we present the main results of the paper. Algorithms to generate the compact representation of polymatroid axioms are given in Section IV. In Section V, we discuss the reduction in polymatroid inequalities. In Section VI, we show an application of the main results to obtain a compact characterization of polymatroid axioms for random variables with Markov structures.

II. PROBLEM FORMULATION

Let V = {A, B, ...} be a finite set and P(V) be its power set (i.e., the set of all subsets). A rank function

h : P(V) → R

is simply defined as a real-valued function defined on P(V) such that h(∅) = 0. If h is known from the context, we will often denote h(A) by H(A).

A rank function h is called entropic if the elements in V are random variables such that h(A) is the joint entropy of the set of random variables in A.

It is sometimes instrumental to treat h as a column vector (or a point) in a 2^|V|-dimensional Euclidean space, such that (1) the axes of the space are labelled by elements of P(V), and (2) the "coordinate" of the point or the vector with respect to the A-axis is given by h(A).
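To make the vector view of an entropic rank function concrete, the following is a minimal sketch, not taken from the paper. It assumes a hypothetical input format (a dictionary mapping outcome tuples to probabilities) and indexes subsets of V by bitmasks, so that the coordinate at index `mask` is the joint entropy, in bits, of the variables whose bits are set.

```python
import math

def entropy_vector(pmf, n):
    """Return h as a list of length 2**n indexed by subset bitmasks.

    pmf maps outcome tuples of length n to probabilities (assumed format).
    Bit i of the index is set when variable i belongs to the subset.
    """
    h = [0.0] * (1 << n)          # h[0] = h(empty set) = 0
    for mask in range(1, 1 << n):
        marginal = {}
        for outcome, p in pmf.items():
            # Project the outcome onto the variables selected by mask.
            key = tuple(outcome[i] for i in range(n) if (mask >> i) & 1)
            marginal[key] = marginal.get(key, 0.0) + p
        h[mask] = -sum(p * math.log2(p) for p in marginal.values() if p > 0)
    return h

# Illustrative example: X0, X1 independent fair bits and X2 = X0 XOR X1.
pmf = {(a, b, a ^ b): 0.25 for a in (0, 1) for b in (0, 1)}
h = entropy_vector(pmf, 3)
```

In this example every single variable has entropy 1 bit and every pair, as well as the full set, has entropy 2 bits, so h is a point in 8-dimensional space with one coordinate per subset.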
Let Γ*[V] (or simply Γ*) be the set of all entropy functions. One of the most fundamental and important problems in information theory is to characterize Γ*. Unfortunately, this is an extremely difficult problem. Answers to this question are only known when the size of V is less than four. However, an outer bound for Γ*[V] is available. One of the most common outer bounds of Γ*[V] is the set of polymatroidal functions, denoted by Γ[V] (or simply Γ).

Definition 1 (Polymatroids): A rank function h is a polymatroid (or equivalently is in Γ[V]) if

h(∅) = 0, (1)
h(A) ≥ h(B), ∀B ⊆ A, (2)
h(A) + h(B) ≥ h(A ∪ B) + h(A ∩ B), ∀A, B ⊆ V. (3)

As all entropic functions are polymatroids, (1)-(3) can be regarded as information inequalities. In fact, (2) corresponds to the nonnegativity of conditional entropies and (3) corresponds to the nonnegativity of conditional mutual information. These information inequalities are the "basic laws of information". They are of critical importance in proving converses of coding theorems. In fact, for most communication problems for which the capacity is known, only the basic inequalities are used to derive the capacity.

Another application of Γ[V] is to derive outer bounds for the set of achievable throughputs in a network. Without going through the details, it can be proved that the throughput of a network can be bounded by solving a linear programming problem in the following form:
Authorized licensed use limited to: Centrale Supelec. Downloaded on November 07,2024 at 09:50:44 UTC from IEEE Xplore. Restrictions apply.
2012 IEEE Information Theory Workshop
I(A; B|C) = 0
I(A; B|CD) ≥ 0
I(B; D|C) ≥ 0
implies

for some real number c and nonnegative numbers c_j for j ∈ {1, ..., |∆|}. Consequently,

\[ I(U; V \mid W) + \sum_{j \in \{1, \dots, |\Delta|\}} c_j \delta_j = c\, I(A; B \mid C_i). \qquad (18) \]
1) Either U ∈ A and V ∈ B or U ∈ B and V ∈ A,
2) W \ ((A ∪ B) \ {U, V}) ⊆ C. In other words, there exists D ⊆ (A ∪ B) \ {U, V} such that W = C ∪ D.

Consequently, the equality I(U; V|W) = 0 belongs to I, and the lemma is proved.

Using the same approach, we also have the following lemma.

Lemma 4 (Deriving new equalities (2)): The set of elemental inequalities ∆, together with the equality

J = H(A|C) = 0

maximally implies the following set of elemental equalities:

\[ I = \left\{ \begin{array}{l} H(A \mid V \setminus \{A\}) = 0 \\ I(A; B \mid DC) = 0 \end{array} \; : \; \begin{array}{l} A \in A, \\ B \in V \setminus (\{A\} \cup C), \\ D \subseteq V \setminus (\{A, B\} \cup C) \end{array} \right\} \]

The maximality of implied elemental equalities in Lemmas 3 and 4 ensures the maximal reduction in elemental inequalities by replacing them with elemental equalities when an equality of conditional independence or functional dependence form is given. In practice, replacing linear inequalities by linear equalities is advantageous since solving linear inequalities is computationally much more expensive than solving linear equalities.

IV. ALGORITHMS

Algorithm 1, Decompose(J, V), returns a set of maximal elemental identities I for an input identity J of the form of functional dependence or conditional independence for a set of random variables V. The algorithm uses Lemmas 3 and 4.

Algorithm 1 Decompose(J, V)
Require: J, V
  I ← ∅
  if J = {H(A|C) = 0} then
    for all A ∈ A do
      I ← I ∪ {H(A|V \ {A}) = 0}
      for all B ∈ V \ ({A} ∪ C) do
        for all D ⊆ V \ ({A, B} ∪ C) do
          I ← I ∪ {I(A; B|DC) = 0}
        end for
      end for
    end for
    Return I
  else if J = {I(A; B|C) = 0} then
    for all A ∈ A, B ∈ B do
      for all D ⊆ (A ∪ B) \ {A, B} do
        I ← I ∪ {I(A; B|DC) = 0}
      end for
    end for
    Return I
  end if

Algorithm 2, ReducedAxioms(J, V), returns a compact set K of polymatroid axioms (elemental identities and elemental inequalities) for an input set of identities J of the form of functional dependence and conditional independence for a set of random variables V. Specifically, the set K does not contain those elemental inequalities which are proved to be redundant in Theorem 1. Algorithm 1, Decompose(J, V), is used as a subroutine (function) in the algorithm.

Algorithm 2 ReducedAxioms(J, V)
Require: J = {J1, ..., Jn}, V
  K ← ∅
  for all J ∈ J do
    I ← Decompose(J, V)
    K ← K ∪ I
  end for
  for all A ∈ V do
    if {H(A|V \ A) = 0} ∉ K then
      K ← K ∪ {H(A|V \ A) ≥ 0}
    end if
  end for
  for all A, B ∈ V do
    for C = ∅ to V \ {A, B} : C ⊆ V \ {A, B} do
      if
        {I(A; B|C) = 0} ∉ K
      and
        ∄ D ∈ C : { {I(A; D|C \ {D}) = 0}, {I(A; D|C) ≥ (or =) 0}, {I(A; B|C \ {D}) ≥ (or =) 0} } ∈ K
      and
        ∄ D ∈ C : { {I(B; D|C \ {D}) = 0}, {I(B; D|C) ≥ (or =) 0}, {I(A; B|C \ {D}) ≥ (or =) 0} } ∈ K
      then
        K ← K ∪ {I(A; B|C) ≥ 0}
      end if
    end for
  end for
  Return K

V. ELEMENTAL INEQUALITY REDUCTION

Using the results developed in Section III, a significant reduction in polymatroid axioms can be achieved. To give an idea of the reduction, we state the following lemma.

Lemma 5 (Reduction): Let J = {I(A; B|C) = 0} be a given information identity for disjoint sets of random variables A, B, C ⊂ V, |V| = n. Then, there are

\[ |A||B| \, 2^{|A|+|B|-2} \qquad (19) \]

many maximally implied elemental equalities and at least

\[ |A||B| \left( \sum_{i=0}^{|A|-1} \binom{|A|-1}{i} \bigl( |A| - 1 - i + |V \setminus (A \cup B \cup C)| \bigr) + \sum_{i=0}^{|B|-1} \binom{|B|-1}{i} \bigl( |B| - 1 - i + |V \setminus (A \cup B \cup C)| \bigr) \right). \]
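The enumeration performed by Algorithm 1 and the count in (19) can be checked on small examples. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names `subsets`, `decompose_ci` and `decompose_fd` and the tuple encoding of identities are hypothetical, variables are represented by strings, and the sets A, B, C (respectively A, C) are assumed to be pairwise disjoint.

```python
from itertools import combinations

def subsets(items):
    """Yield every subset of items as a frozenset."""
    items = sorted(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def decompose_ci(A, B, C):
    """Elemental equalities listed by Algorithm 1 for I(A;B|C) = 0.

    I(a;b|D ∪ C) = 0 is encoded as ("I", a, b, D ∪ C) (assumed encoding).
    """
    A, B, C = frozenset(A), frozenset(B), frozenset(C)
    out = set()
    for a in A:
        for b in B:
            for D in subsets((A | B) - {a, b}):
                out.add(("I", a, b, D | C))
    return out

def decompose_fd(A, C, V):
    """Elemental equalities listed by Algorithm 1 for H(A|C) = 0.

    H(a|V \\ {a}) = 0 is encoded as ("H", a, V \\ {a}). Pairs are kept in
    the order generated, so symmetric duplicates are not merged.
    """
    A, C, V = frozenset(A), frozenset(C), frozenset(V)
    out = set()
    for a in A:
        out.add(("H", a, V - {a}))
        for b in V - ({a} | C):
            for D in subsets(V - ({a, b} | C)):
                out.add(("I", a, b, D | C))
    return out

# Illustrative input: |A| = 2, |B| = 3, |C| = 1.
eqs = decompose_ci({"X1", "X2"}, {"Y1", "Y2", "Y3"}, {"Z"})
fd = decompose_fd({"a"}, {"c"}, {"a", "b", "c"})
```

For the illustrative input, `len(eqs)` equals |A||B|2^(|A|+|B|-2) = 2 · 3 · 2^3, which agrees with (19).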