Mutual Information Between Initial and Final State
Annals of Physics
1. Introduction
Quantum information is primarily understood in terms of von Neumann entropy and re-
lated quantities [1,2]. Due to inherently quantum phenomena such as entanglement, quantum
information measures—such as conditional von Neumann entropy and mutual von Neumann
information—lack well-defined underlying probability distributions. Nevertheless, despite their own
somewhat unclear conceptual underpinnings, these quantities have proved useful for reframing
and clarifying aspects of quantum information. Many of the relationships satisfied by classical
information measures are mirrored by their quantum analogues [1–3], sometimes quite remarkably,
as in the case of strong subadditivity [4].
In this paper, we define and study new forms of quantum information that complement the stan-
dard quantities. The key ingredients in our approach are conditional probability distributions, first
studied in [5,6], that provide an underlying picture for the type of information being described. In
∗ Corresponding author.
E-mail addresses: [email protected] (J.A. Barandes), [email protected] (D. Kagan).
https://doi.org/10.1016/j.aop.2022.169192
0003-4916/© 2022 Elsevier Inc. All rights reserved.
J.A. Barandes and D. Kagan Annals of Physics 448 (2023) 169192
particular, we are able to provide a description of information flow in the context of open quantum
systems whose dynamical evolution is well-approximated by linear, completely positive, trace-
preserving (CPTP) maps, without any explicit appeal to larger Hilbert spaces or ancillary systems.
We show that some standard results of quantum information theory emerge quite naturally from
our perspective.
Section 2 provides some relevant background on classical and quantum information. In Section 3,
we define new forms of quantum conditional entropy and quantum mutual information in terms
of quantum conditional probabilities, and briefly describe a dynamical interpretation of these
quantities. In Section 4, we use the results of the previous section to analyze processes under which
there is growth in entropy (in the sense of Shannon) and to provide new proofs of the concavity
of von Neumann entropy and quantum data processing. We demonstrate that our quantum data-
processing inequality provides a natural interpretation of Holevo’s theorem in a dynamical context,
showing that Holevo’s χ acts as an upper bound on the amount of information that can flow from a
system’s initial configuration to a later one. In Section 5, we present a discussion of various ways to
generalize our constructions, including to an analysis of the relationships between subsystems and
the parent systems to which they belong, and to more general decompositions of density matrices
than the ones that play a primary role in the paper. In Section 6, we identify connections between
the constructions in this paper and previous work. We conclude in Section 7 with a brief summary
of our results and interesting open questions.
2. Background
Consider a classical random variable X whose outcomes {x}_x occur according to a probability distribution {p(x)}_x. Using this data, we can compute expectation values, standard deviations,
and so on. Assuming a discrete set of outcomes, the average information encoded in the probability
distribution is given by its Shannon entropy:
H(X) ≡ −∑_x p(x) log p(x). (1)
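As a quick numerical illustration (ours, not part of the paper), the Shannon entropy (1) can be computed directly; the function name and example distribution below are our own choices:

```python
import numpy as np

def shannon_entropy(p, base=2):
    """Shannon entropy H = -sum_x p(x) log p(x), with 0 log 0 taken to be 0."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]  # drop zero-probability outcomes
    return float(-np.sum(p * np.log(p)) / np.log(base))

# A uniform distribution over four outcomes encodes 2 bits of information.
print(shannon_entropy([0.25, 0.25, 0.25, 0.25]))  # ~2.0
```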
The simplest kind of density matrix corresponds to a pure state, and can be expressed as a
projection operator of the form |Ψ ⟩⟨Ψ |. In this simple case, the formula (2) reduces to
In general, a density matrix has infinitely many possible decompositions over sets of projectors
{Π̂α }α ,
ρ̂ = ∑_α λ_α Π̂_α,  Π̂_α = |φ_α⟩⟨φ_α|, (4)
where the set {λα }α consists of non-negative real numbers that sum to unity, and where {|φα ⟩}α is
not necessarily an orthonormal set of states. Each such decomposition has a corresponding Shannon
entropy:
H({λ_α}) = −∑_α λ_α log λ_α. (5)
The decomposition that minimizes [7] the Shannon entropy consists of the eigenvalues and corre-
sponding eigenprojectors of ρ̂ ,
ρ̂ = ∑_i p_i P̂_i,  P̂_i = |Ψ_i⟩⟨Ψ_i|, (6)
where {|Ψi ⟩}i is the set of eigenstates of ρ̂ . The von Neumann entropy of a density matrix ρ̂ is this
minimal Shannon entropy of ρ̂ ,
S(ρ̂) ≡ −Tr[ρ̂ log ρ̂] = −∑_i p_i log p_i, (7)
and therefore represents the minimum amount of average information that can be encoded in a
system described by ρ̂ .
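To make (5)–(7) concrete, here is a small NumPy check (our own illustration, not part of the paper) that a non-orthogonal decomposition of a qubit density matrix carries more Shannon entropy than the minimizing eigen-decomposition:

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) = -Tr[rho log rho], computed from the eigenvalues of rho."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log(evals)))

# rho built from the non-orthogonal decomposition (1/2)|0><0| + (1/2)|+><+|.
ket0 = np.array([1.0, 0.0])
plus = np.array([1.0, 1.0]) / np.sqrt(2)
rho = 0.5 * np.outer(ket0, ket0) + 0.5 * np.outer(plus, plus)

H_decomposition = np.log(2)    # Shannon entropy (5) of the weights {1/2, 1/2}
S = von_neumann_entropy(rho)   # the minimal Shannon entropy (7)
print(S < H_decomposition)  # True: the eigen-decomposition minimizes the entropy
```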
Classically, the conditional entropy of a random variable Y given another random variable X is
defined in terms of a conditional probability distribution p(y|x) that describes correlations between
possible outcomes of the two random variables Y and X . Specifically, the conditional entropy of a
random variable Y given that X takes the value x is defined to be
H(Y|x) ≡ −∑_y p(y|x) log[p(y|x)]. (8)
The full conditional entropy of Y given X is the average of (8) over the possible outcomes of X,
H(Y|X) = ∑_x p(x) H(Y|x), (9)
which can be thought of as the average information encoded in Y given a particular outcome of X, averaged over all the possible outcomes of X.
Conditional entropies satisfy the identity
H(Y |X ) = H(Y , X ) − H(X ), (10)
where H(Y , X ) is the Shannon entropy of the joint distribution in X and Y . The identity (10)
captures the intuition that the conditional entropy measures the information about Y encoded in
its correlations with X in excess of information encoded in X alone.
In the quantum case, the pair of random variables X and Y are replaced by a bipartite quantum
system AB, with a corresponding density matrix ρ̂AB . The standard definition of conditional von
Neumann entropy adopts the form of the classical relation (10), with S(ρ̂AB ) in place of the classical
joint entropy and S(ρ̂B ) substituted for H(X ), where ρ̂B is the reduced density matrix for subsystem
B, as defined by the partial trace over subsystem A. That is, the conditional von Neumann entropy
is given by
S(A|B) ≡ S(ρ̂AB ) − S(ρ̂B ), ρ̂B = TrA [ρ̂AB ]. (11)
Unlike classical conditional entropy, conditional von Neumann entropy defined by (11) lacks an
underlying probability distribution, as can be seen from the fact that S(A|B) can be negative [1] when
subsystems A and B are entangled. In [8], the authors introduce a conditional amplitude operator
ρ̂A|B as one possible generalization of a conditional probability distribution, but the operator is not a
density matrix, and thus lacks a clear interpretation itself. Operational approaches are quite fruitful
(see [9] for example), but they do not always clarify the conceptual underpinnings of such quantities.
The type of information measures studied in this paper are built from quantum conditional prob-
abilities first explored in the context of the minimal modal interpretation of quantum theory [5,6].
While the quantities we discuss here require nothing beyond standard quantum theory for their
formulation, we adopt the language of the minimal modal interpretation, as it provides a useful
way to describe what follows.
To start, imagine that at a given time, a quantum system is described by an ‘objective’ density
matrix ρ̂Q —objective in the sense that it is empirically optimal among all possible density matrices
that an external observer could assign to the system.1 Now suppose that from the initial time to
a later time, the density matrix evolves from ρ̂Q to a final density matrix ρ̂R according to a linear
CPTP map ER←Q :
According to the minimal modal interpretation, every quantum system has an actual underlying
state corresponding to one of the eigenstates of the system’s density matrix, but that actual under-
lying state is hidden from external observers unless the system’s density matrix is a projector. In our
present example, the system’s actual underlying state evolves from being one of the eigenstates of
ρ̂Q to being one of the eigenstates of ρ̂R . Collectively, the eigenstates of ρ̂Q represent the initial
possible underlying states of the system, and the eigenstates of ρ̂R represent the final possible
underlying states.
The evolution of the possible underlying states of the system is defined stochastically in terms
of quantum conditional probabilities. For example, the probability that the system’s later state is
|Ψr ⟩ given that it was initially |Ψq ⟩ is defined to be
p_E(r|q) ≡ Tr[P̂_r E_{R←Q}{P̂_q}] = ⟨Ψ_r|E_{R←Q}{P̂_q}|Ψ_r⟩. (15)
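As a numerical sketch (our own, with an amplitude-damping channel chosen purely for illustration), the conditional probabilities (15) can be computed from the Kraus representation of a CPTP map; for each q they form a normalized distribution over r, and they reproduce the eigenvalues of ρ̂_R via the law of total probability:

```python
import numpy as np

def apply_channel(kraus_ops, rho):
    """Apply a CPTP map in Kraus form: E(rho) = sum_k K rho K^dag."""
    return sum(K @ rho @ K.conj().T for K in kraus_ops)

# Hypothetical amplitude-damping channel with decay probability g (our choice).
g = 0.3
kraus = [np.array([[1.0, 0.0], [0.0, np.sqrt(1 - g)]]),
         np.array([[0.0, np.sqrt(g)], [0.0, 0.0]])]

# Initial density matrix rho_Q and its eigen-decomposition {p_q, |Psi_q>}.
rho_Q = np.array([[0.7, 0.2], [0.2, 0.3]])
p_q, V_Q = np.linalg.eigh(rho_Q)

# Final density matrix rho_R = E{rho_Q} and its eigenbasis {|Psi_r>}.
rho_R = apply_channel(kraus, rho_Q)
p_r, V_R = np.linalg.eigh(rho_R)

# p_E(r|q) = <Psi_r| E{|Psi_q><Psi_q|} |Psi_r>, as in Eq. (15).
proj_q = [np.outer(V_Q[:, q], V_Q[:, q]) for q in range(2)]
cond = np.array([[V_R[:, r] @ apply_channel(kraus, proj_q[q]) @ V_R[:, r]
                  for q in range(2)] for r in range(2)])

print(cond.sum(axis=0))              # each column sums to 1 (trace preservation)
print(np.allclose(cond @ p_q, p_r))  # True: law of total probability
```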
Note that throughout this paper, lower-case index labels q, q′, … and r, r′, … on states correspond to the eigenstates of ρ̂_Q and ρ̂_R, respectively.
1 Specifically, by an objective density matrix, we mean a density matrix whose mixedness arises entirely from entanglement with other systems and is therefore solely an improper mixture. In particular, we do not include any classical uncertainty. For a system not entangled with its environment, the objective density matrix is a rank-one projector representing a pure state. For a system entangled with its environment, the von Neumann entropy of the objective density matrix is precisely equal to the entanglement entropy. A physically realistic observer cannot improve on a system’s objective density matrix without physically affecting the system by introducing new forms of entanglement.
allowing us to arrive at (16) by identifying the trace in (19) as the quantum conditional probability
(15).
The quantum conditional probabilities pE (r |q) can be associated with a formal density matrix
ρ̂^E_{R|q} ≡ ∑_r p_E(r|q) P̂_r, (20)
which satisfies
ρ̂_R = ∑_q p_q ρ̂^E_{R|q}, (21)
due to (16).
A crucial difference between classical and quantum conditional probabilities is that the latter fail to satisfy Bayes’ theorem.
The failure of Bayes’ theorem reflects the non-commutativity of quantum observables, and therefore
the inability to define a symmetric joint probability distribution. From a dynamical perspective,
Bayes’ theorem fails due to the generic irreversibility of ER←Q , as is evident from the case in which
ER←Q represents a projective measurement.2
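A small numerical illustration (ours, not from the paper): take E_{R←Q} to be a projective measurement without post-selection, and compare the forward quantity p(r|q) p_q with the analogous reverse quantity built from Tr[P̂_q E{P̂_r}] p_r, one natural candidate for a Bayes-type reverse conditional. The two disagree, reflecting the absence of a symmetric joint distribution underlying (15):

```python
import numpy as np

# Eigenbasis of rho_Q (rotated by an illustrative angle theta) and its weights.
theta = 0.4
v0 = np.array([np.cos(theta), np.sin(theta)])
v1 = np.array([-np.sin(theta), np.cos(theta)])
p = np.array([0.8, 0.2])
rho_Q = p[0] * np.outer(v0, v0) + p[1] * np.outer(v1, v1)

# Projective measurement without post-selection in the computational basis.
P = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]
measure = lambda rho: sum(Pm @ rho @ Pm for Pm in P)

rho_R = measure(rho_Q)   # diagonal, so its eigenprojectors are the P_r
p_r = np.diag(rho_R)
proj = [np.outer(v0, v0), np.outer(v1, v1)]

# Forward conditional probability p(r|q) and a candidate reverse counterpart.
forward = np.array([[np.trace(P[r] @ measure(proj[q])).real
                     for q in range(2)] for r in range(2)])
reverse = np.array([[np.trace(proj[q] @ measure(P[r])).real
                     for q in range(2)] for r in range(2)])

# Bayes' theorem would require forward[r, q] * p[q] == reverse[r, q] * p_r[r].
print(forward[0, 0] * p[0], reverse[0, 0] * p_r[0])  # these differ
```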
In general, linear CPTP evolution of an eigenprojector of the initial density matrix yields a
nontrivial density matrix defined by
Introducing a new label r_q to distinguish the eigenprojectors {P̂_{r_q}}_{r_q} of this density matrix, we can write down its spectral decomposition:
ρ̂_q^R = ∑_{r_q} p_E(r_q|q) P̂_{r_q}. (24)
Note that for each fixed value of q, the basis of eigenprojectors {P̂_{r_q}}_{r_q} can be different, and will generically differ from {P̂_r}_r.
Nevertheless, the set of these density matrices must combine to yield ρ̂R ,
ρ̂_R = ∑_q p_q ρ̂_q^R, (25)
as a consequence of (23).
The relations (21) and (25) suggest that ρ̂^E_{R|q} and ρ̂_q^R are themselves related. To see how, notice that the quantum conditional probabilities p_E(r|q) can be expressed as
where in passing from the first to the second line we have used the decomposition (24). The quantity
inside the trace has the form of a Born probability,
2 The paper by Schack, Brun, and Caves [10] is a prominent example of work that does indeed derive a quantum
version of Bayes’ rule. However, these sorts of results rely on taking a large number of copies of a system’s Hilbert space
to represent a large ensemble of identical systems. The conditional probabilities we define in (15) differ in essential
ways from these earlier constructions, as is apparent from the fact that our conditional probabilities involve only a single
instance of a system’s Hilbert space. Thus, the failure of Bayes’ theorem is compatible with these prior results.
Note that
S(ρ̂^E_{R|q}) ≥ S(ρ̂_q^R), (32)
which follows from the double stochasticity of the Born probability distribution β(r|r_q).3
So far, our description of the quantum conditional probabilities (15) has been dynamical, with
ER←Q thought of as an evolution map. However, the same ideas can be applied to the quantum
relationships between systems and their subsystems by noting that partial traces are an example
of a linear CPTP map. We provide a more detailed sketch of these ideas in Section 5. In what follows,
we will continue to focus on the dynamical picture, in which a single system evolves according to
ER←Q .
Combining the quantum conditional probabilities of (15) with Shannon’s entropy formula yields
a new type of quantum conditional entropy. Using the initial and final density matrices defined in
(13) and (14), respectively, we let
J_E(R|q) ≡ −∑_r p_E(r|q) log[p_E(r|q)] = S(ρ̂^E_{R|q}) (33)
be the quantum conditional entropy of our system given that the system’s initial underlying state
corresponded to the eigenstate |Ψq ⟩ of ρ̂Q . We will argue that we can interpret this quantity as the
entropy added to the system during its evolution given the initial underlying state of the system.
The full quantum conditional entropy is the average over all possible initial eigenstates of ρ̂Q :
J_E(R|Q) ≡ ∑_q J_E(R|q) p_q = −∑_{q,r} p_E(r|q) p_q log[p_E(r|q)]. (34)
3 We discuss doubly stochastic probability distributions in the appendix, providing an explicit proof of a generalization
of (32).
The relation
IE (R : Q ) = S(ρ̂R ) − JE (R|Q ) (36)
follows directly from the definitions of quantum conditional entropy (34) and quantum mutual
information (35), mirroring the classical identity
I(Y : X ) = H(Y ) − H(Y |X ). (37)
In a dynamical context, mutual information can be thought of as measuring the information that is
shared between the initial and final system configurations.
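The identity (36) can be checked numerically. In this sketch (our own; the depolarizing-style channel is chosen only for illustration), we compute p_E(r|q) from (15), build J_E(R|Q) and I_E(R : Q), and confirm that they sum to S(ρ̂_R):

```python
import numpy as np

def entropy(p):
    p = np.asarray(p, dtype=float)
    p = p[p > 1e-12]
    return float(-np.sum(p * np.log(p)))

# Hypothetical depolarizing-style qubit channel, used only for illustration.
lam = 0.4
channel = lambda rho: (1 - lam) * rho + lam * np.trace(rho) * np.eye(2) / 2

rho_Q = np.array([[0.75, 0.15], [0.15, 0.25]])
p_q, V_Q = np.linalg.eigh(rho_Q)
rho_R = channel(rho_Q)
p_r, V_R = np.linalg.eigh(rho_R)

# Conditional probabilities p_E(r|q), as in Eq. (15).
cond = np.array([[V_R[:, r] @ channel(np.outer(V_Q[:, q], V_Q[:, q])) @ V_R[:, r]
                  for q in range(2)] for r in range(2)])

# Conditional entropy J_E(R|Q), Eq. (34), and the mutual information I_E(R:Q).
J = -sum(cond[r, q] * p_q[q] * np.log(cond[r, q])
         for r in range(2) for q in range(2) if cond[r, q] > 0)
I = sum(cond[r, q] * p_q[q] * np.log(cond[r, q] / p_r[r])
        for r in range(2) for q in range(2) if cond[r, q] > 0)

# The identity (36): I_E(R:Q) = S(rho_R) - J_E(R|Q).
print(abs(I - (entropy(p_r) - J)) < 1e-9)  # True
```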
The new forms of quantum conditional entropy and quantum mutual information defined in
(33) and (35), respectively, are distinct from the traditional quantities found in the literature. As
discussed in Section 2, the traditional conditional von Neumann entropy S(A|B) in Eq. (11) is not
defined in terms of an underlying probability distribution. The traditional von Neumann mutual information I^VN(A : B) shared by subsystems A and B is defined as
I^VN(A : B) ≡ S(ρ̂_A) − S(A|B). (38)
Once again, there need not be any underlying probability distribution in these traditional definitions.
We will show that the new information measures developed in this paper satisfy inequalities that
are analogous to those satisfied by (11) and (38). However, the existence of underlying quantum
conditional probabilities (15) provides conceptually clearer interpretations of the sort of information
measured by these new quantities.
In words, the increase in the system’s entropy arises solely from the evolution of the system. We
can also characterize this statement in terms of the mutual information, which vanishes,
I_E(R : Ψ) = S(ρ̂_R) − J_E(R|Ψ) = 0, (42)
thereby showing that no information is carried over from the system’s initial state to its final
configuration.
This linear CPTP map can be thought of as modeling a process in which the system becomes
more entangled with its surrounding environment.4 From this perspective, the quantum conditional
entropy measures the growth of entanglement between a system and its environment.
4 This interpretation assumes that the map is faithful to the underlying physics, rather than capturing measurement
or modeling errors.
where I is the identity. Under such evolution, the eigenvalues of ρ̂Q are unchanged and the
eigenstates rotate into the set of eigenstates of ρ̂R ,
Due to the existence of an underlying probability distribution, the quantum conditional entropy
(33) and mutual information (35) satisfy various relationships familiar from classical information
theory.
• A system’s conditional entropy cannot be greater than the system’s final entropy:
J_E(R|Q) ≤ S(ρ̂_R). (48)
The inequalities (46), (47), and (48) can be proved following similar steps to those from classical
information theory. We provide details in the appendix.
refrain from learning the outcome, then the post-measurement density matrix is well-approximated
by
ρ̂_M = M{ρ̂_Q} = ∑_m P̂_m ρ̂_Q P̂_m, (52)
which is clearly unital. As a result, we see that measurements without post-selection increase the
entropy of a system.
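A quick check of this claim (our own illustration): dephasing a qubit with coherences in the computational basis, as in (52), does not decrease the von Neumann entropy:

```python
import numpy as np

def vn_entropy(rho):
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-np.sum(w * np.log(w)))

# Measurement without post-selection in the computational basis, as in (52).
P = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]
measure = lambda rho: sum(Pm @ rho @ Pm for Pm in P)

rho_Q = np.array([[0.6, 0.3], [0.3, 0.4]])  # illustrative state with coherences
rho_M = measure(rho_Q)                      # off-diagonal terms are erased

print(vn_entropy(rho_Q) <= vn_entropy(rho_M))  # True: entropy does not decrease
```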
The quantities described earlier allow us to demonstrate certain standard properties of quantum
information. Consider the concavity of von Neumann entropy,
∑_i p_i S(ρ̂_i) ≤ S(ρ̂),  ρ̂ = ∑_i p_i ρ̂_i, (53)
where ρ̂ is an arbitrary density matrix, and the set of pairs {(p_i, ρ̂_i)}_i is any collection of non-negative weights and density matrices that form a decomposition of ρ̂, with the weights summing to unity. Note that the number of elements in the set can exceed the dimension of the Hilbert space.
To prove (53), we let ρ̂ = ρ̂_R. Given a decomposition into a set of weights and density matrices {(p_i, ρ̂_i)}_i, we can define a linear CPTP map E and a density matrix ρ̂_Q such that ρ̂_R = E{ρ̂_Q} and such that the elements of the decomposition arise from E applied to the eigen-decomposition of ρ̂_Q, with the identification of the i and q indices.5 From the relations (21), (25), and (31), we have
ρ̂_R = ∑_q p_q ρ̂_{R|q} = ∑_q p_q ρ̂_q^R, (54)
with
ρ̂_{R|q} = ∑_r P̂_r ρ̂_q^R P̂_r. (55)
Thus,
∑_q p_q S(ρ̂_q^R) ≤ ∑_q p_q S(ρ̂_{R|q}) ≤ S(ρ̂_R),
where the first inequality follows from (32), while the second is the inequality (48), demonstrating
the concavity of von Neumann entropy.
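Concavity (53) is easy to test numerically on random mixtures (an illustrative check, not part of the paper's proof):

```python
import numpy as np

def vn_entropy(rho):
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-np.sum(w * np.log(w)))

def random_density_matrix(d, rng):
    """Random full-rank density matrix via A A^dag normalized to unit trace."""
    A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = A @ A.conj().T
    return rho / np.trace(rho).real

rng = np.random.default_rng(0)

# Concavity (53): sum_i p_i S(rho_i) <= S(sum_i p_i rho_i).
weights = np.array([0.5, 0.3, 0.2])
states = [random_density_matrix(2, rng) for _ in range(3)]
mixture = sum(p * r for p, r in zip(weights, states))

lhs = sum(p * vn_entropy(r) for p, r in zip(weights, states))
rhs = vn_entropy(mixture)
print(lhs <= rhs + 1e-12)  # True
```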
Consider a system that evolves from ρ̂Q to ρ̂R , and then to ρ̂S , as described by the linear CPTP
maps ER←Q and ES ←R , so that we have
5 Note that we implicitly allow E to involve a partial trace operation so that the Hilbert space dimension associated
with the final density matrix ρ̂R can be smaller than that of ρ̂Q .
ρ̂_R = E_{R←Q}{∑_q p_q P̂_q} = ∑_q p_q E_{R←Q}{P̂_q},  p(r|q) = Tr[P̂_r E_{R←Q}{P̂_q}], (58)
where we suppress the map label, as the mapping will be clear from the state indices.
Similarly, we have
ρ̂_S = E_{S←R}{∑_r p_r P̂_r} = ∑_r p_r E_{S←R}{P̂_r},  p(s|r) = Tr[P̂_s E_{S←R}{P̂_r}], (59)
as well as
ρ̂_S = E_{S←Q}{∑_q p_q P̂_q} = ∑_q p_q E_{S←Q}{P̂_q},  p(s|q) = Tr[P̂_s E_{S←Q}{P̂_q}]. (60)
There are some subtle constraints required for the consistency of these processes. Using the law
of total probability and (28), we have
p_s = ∑_r p(s|r) p_r
= ∑_{r,q} p(s|r) p(r|q) p_q
= ∑_{r,q,r_q} p(s|r) β(r|r_q) p(r_q|q) p_q. (61)
Similarly, we have
p_s = ∑_q p(s|q) p_q. (62)
Comparing (61) and (64), we find that a natural-looking consistency condition to impose would be
p(s|r_q) = ∑_r p(s|r) β(r|r_q). (65)
We therefore restrict our maps ER←Q and ES ←R to those satisfying (65). The existence of such maps
can be demonstrated by expanding out the definitions of the conditional probabilities in (65) on
Conceptually, this projective measurement ensures that the intermediate composite state of the
system and its environment re-factorize, thus leading to Markov-like evolution.6 Putting all this
together, we have
p(s|q) ≡ Tr[P̂_s E_{S←Q}{P̂_q}]
= Tr[P̂_s E_{S←R} ∘ E_{R←Q}{P̂_q}]
= ∑_{r_q} Tr[P̂_s E_{S←R}{P̂_{r_q}}] p(r_q|q)
= ∑_{r_q} p(s|r_q) p(r_q|q)
= ∑_{r,r_q} p(s|r) β(r|r_q) p(r_q|q)
= ∑_r p(s|r) p(r|q). (69)
The mutual information shared between the initial and final configurations is
I(S : Q) = ∑_{s,q} p(s|q) p_q log[p(s|q)/p_s]. (70)
Using (69), the difference between these two quantities can be written as
I(S : Q) − I(R : Q) = ∑_{s,q,r} p(s|r) p(r|q) p_q (log[p(s|q)/p_s] − log[p(r|q)/p_r])
= ∑_{s,q,r} p(s|r) p(r|q) p_q log[p(s|q) p_r / (p_s p(r|q))].
6 Note that we could have instead inserted the projective measurement step along the {P̂_r}_r basis into the map E_{R←Q}. Either way, we demonstrate the existence of a set of maps satisfying the consistency condition (65).
Let us recall the statement of Holevo’s bound. Consider a quantum system and let X be a
classical random variable with possible outcomes {x}_x and corresponding probability distribution {p_x}_x. Suppose that {ρ̂_x}_x is a collection of density matrices indexed by the possible outcomes x of
X , and let ρ̂ be the correspondingly averaged density matrix:
ρ̂ ≡ ∑_x p_x ρ̂_x. (73)
If we now measure a POVM {Ê_y}_y whose possible outcomes y form another classical random variable Y, then Holevo’s bound states that the classical mutual information between X and Y is bounded from above by the quantity
χ ≡ S(ρ̂) − ∑_x p_x S(ρ̂_x). (74)
That is,
I(X : Y ) ≤ χ . (75)
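The bound (74)–(75) can be verified numerically. In the following sketch (ours; the ensemble of two non-orthogonal qubit states and the computational-basis measurement are illustrative choices), we compute χ and the classical mutual information I(X : Y) of the induced joint distribution:

```python
import numpy as np

def vn_entropy(rho):
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-np.sum(w * np.log(w)))

# Illustrative ensemble: two non-orthogonal pure qubit states with equal priors.
theta = np.pi / 8
states = [np.array([1.0, 0.0]), np.array([np.cos(theta), np.sin(theta)])]
px = np.array([0.5, 0.5])
rho_x = [np.outer(v, v) for v in states]
rho = sum(p * r for p, r in zip(px, rho_x))

# Holevo's chi quantity, Eq. (74); the ensemble states are pure, so S(rho_x) = 0.
chi = vn_entropy(rho) - sum(p * vn_entropy(r) for p, r in zip(px, rho_x))

# Measure in the computational basis (a projective POVM {E_y}).
E = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]
p_xy = np.array([[px[x] * np.trace(E[y] @ rho_x[x]).real for y in range(2)]
                 for x in range(2)])
p_y = p_xy.sum(axis=0)

I_XY = sum(p_xy[x, y] * np.log(p_xy[x, y] / (px[x] * p_y[y]))
           for x in range(2) for y in range(2) if p_xy[x, y] > 0)

print(I_XY <= chi)  # True: Holevo's bound (75)
```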
In the two-step process described in Section 4.3, the mutual information between the initial
configuration ρ̂Q and the intermediate configuration ρ̂R can be expressed as
I(R : Q) = S(ρ̂_R) − J(R|Q) = S(ρ̂_R) − ∑_q p_q S(ρ̂_{R|q}), (76)
where
ρ̂_{R|q} = ∑_r P̂_r E_{R←Q}{P̂_q} P̂_r. (77)
The quantity on the right-hand side of (76) is clearly an example of Holevo’s χ quantity. We see
that it emerges quite naturally as an example of our newly defined mutual information, and that
Holevo’s bound (75) arises as a manifestation of our quantum data-processing inequality (72). The
Holevo bound’s interpretation as a quantum version of the data-processing inequality has been
discussed before (see for example [11]). Our dynamical interpretation of the bound provides another
7 Jensen’s inequality states that if f (x) is a convex function of its argument x, then the average of f (x) provides an
upper bound for the original function applied to the average of its argument. Here we apply Jensen’s inequality to − log x.
perspective that avoids any explicit embedding of the system of interest into a larger composite
system. Instead, we capture the role of the broader environment through the formalism of linear
CPTP maps.
5. Discussion
Our focus in this paper has been on a dynamical interpretation of quantum information in a
system whose evolution is described by a linear CPTP map. However, as mentioned in Section 3.1,
the formalism is general enough to capture structural relationships between composite quantum
systems and their subsystems. To begin, consider the parent system AB formed from a pair of
quantum subsystems A and B and described by the density matrix
ρ̂_AB = ∑_m p_m^{AB} P̂_m^{AB},  P̂_m^{AB} = |Ψ_m^{AB}⟩⟨Ψ_m^{AB}|, (78)
where we include the parent system’s label AB on the system’s eigenprojectors P̂_m^{AB} and the corresponding probabilities p_m^{AB}. The subsystem density matrices are related to ρ̂_AB via the appropriate partial traces,
ρ̂_A = Tr_B[ρ̂_AB] = ∑_a p_a^A P̂_a^A,  ρ̂_B = Tr_A[ρ̂_AB] = ∑_b p_b^B P̂_b^B, (79)
where the sets of eigenprojectors for subsystems A and B are {P̂_a^A}_a and {P̂_b^B}_b, respectively.
Quantum probabilities that conditionally link subsystem eigenstates to a given eigenstate of the parent system are again defined using (15), substituting the relevant partial trace for the linear CPTP map in the formula. For instance, the conditional probability that |Ψ_a^A⟩ is the actual underlying state of subsystem A given that the underlying state of AB is |Ψ_m^{AB}⟩ is8
As in Section 3.1, the partial trace applied to system AB’s eigenprojector yields a density matrix
ρ̂_m^A = Tr_B[P̂_m^{AB}] = ∑_{a_m} p(a_m|m) P̂_{a_m}^A. (81)
We have
ρ̂_A = ∑_m p_m ρ̂_m^A = ∑_m p_m ρ̂_{A|m},  ρ̂_{A|m} = ∑_a P̂_a^A ρ̂_m^A P̂_a^A. (82)
These relationships imply that the quantum entropy conditioned on the parent state |Ψ_m^{AB}⟩ satisfies the inequality
S(ρ̂_m^A) ≤ J(A|m) = −∑_a p(a|m) log p(a|m) = S(ρ̂_{A|m}), (83)
due to the quantities p(a|m) and p(a_m|m) being related via the doubly stochastic distribution
β(a|a_m) = |⟨Ψ_a^A|Ψ_{a_m}^A⟩|². (84)
8 We again adopt the language of the minimal modal interpretation, though the mathematical content involves only
textbook quantum theory.
and to note that it is naturally interpreted as the entanglement entropy of subsystem A conditioned on the parent system AB actually occupying the pure state |Ψ_m^{AB}⟩. Note that when the parent system is in a pure state, then ρ̂_m^A = ρ̂_{A|m}, and J(A|m) is the entanglement entropy of subsystem A.
The full quantum conditional entropy is defined as
J(A|AB) = ∑_m p_m J(A|m). (86)
There are also intriguing relationships between our quantum conditional entropy ((33) and (34))
and conditional von Neumann entropy (11). Observe that the inequality (47) satisfied by our version
of quantum mutual information can be re-expressed as
S(ρ̂A ) − J(A|AB) ≤ S(ρ̂AB ), (88)
where the initial density matrix is taken to be ρ̂AB and the final density matrix is ρ̂A . Rearranging
terms and applying the definition of conditional von Neumann entropy yields
−S(B|A) ≤ J(A|AB). (89)
In the presence of entanglement, S(B|A) may take on negative values, leading to a positive lower
bound on J(A|AB). The result naturally captures the idea that when subsystems are entangled,
there is a non-zero minimal uncertainty about their states even given information about the parent
system.
Our definition of quantum conditional probability (15) involves the eigenprojectors of initial and
final density matrices (13) and (14), respectively. However, as we described in Section 2, there are
infinitely many decompositions of a nontrivial density matrix. Thus, we may consider quantities of
the form
P_E(ρ|κ) = Tr[Π̂_ρ^R E_{R←Q}{Π̂_κ^Q}], (90)
where
ρ̂_Q = ∑_κ λ_κ^Q Π̂_κ^Q,  ρ̂_R = ∑_ρ λ_ρ^R Π̂_ρ^R (91)
are general convex decompositions of the system’s initial and final density matrices, respectively,
with generic projection operators
Π̂_κ^Q = |Φ_κ^Q⟩⟨Φ_κ^Q|,  Π̂_ρ^R = |Φ_ρ^R⟩⟨Φ_ρ^R|. (92)
Note that such sets of projectors need not be orthogonal. However, if we demand that the quantities
(90) behave as probabilities, then the set {Π̂_ρ^R}_ρ must resolve the identity:
∑_ρ Π̂_ρ^R = I. (93)
Nevertheless, these quantities fail to act as fully satisfactory conditional probabilities, as they do
not obey a straightforward version of the law of total probability. Instead, we have
Λ_ρ^R ≡ Tr[Π̂_ρ^R ρ̂_R], (94)
and thus
Λ_ρ^R = ∑_κ P_E(ρ|κ) λ_κ^Q, (95)
where we generically have Λ_ρ^R ≠ λ_ρ^R due to the possible nonorthogonality of the projectors.
Despite their shortcomings as proper conditional probability distributions, the quantities defined
in (90) may yet be of some interest for reasons we detail in Section 7.
In [14], Esposito and Mukamel investigate definitions of work and heat, entropy production,
and fluctuation theorems in the context of open quantum systems. The authors’ results rest on
their construction of quantum transition matrices that can be understood in terms of the quantum
conditional probabilities (15) used in this work. To see this connection, first we follow [14] and
describe the evolution of an open quantum system in terms of a differential linear CPTP map K
that defines the time evolution of the system’s density matrix,
dρ̂_Q(t)/dt = K{ρ̂_Q(t)}. (96)
The quantum transition matrices of [14] can be expressed as
w((q′|q); t) ≡ Tr[P̂_{q′} K{P̂_q}]. (97)
These transition rates satisfy a differential version of the law of total probability,
dp_{q′}/dt = ∑_q w((q′|q); t) p_q. (98)
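As an illustrative check (ours; the jump operator is a hypothetical example, not taken from [14]), trace preservation of the generator K forces the columns of the transition-rate matrix (97) to sum to zero, so the rate equation (98) conserves total probability:

```python
import numpy as np

# Illustrative Lindblad-type generator K{rho} = L rho L^dag - (1/2){L^dag L, rho},
# with a hypothetical jump operator L (our choice, not taken from the literature).
L = np.array([[0.0, 1.0], [0.0, 0.0]])  # qubit lowering operator

def K(rho):
    LdL = L.conj().T @ L
    return L @ rho @ L.conj().T - 0.5 * (LdL @ rho + rho @ LdL)

# Transition rates w(q'|q) = Tr[P_q' K{P_q}], Eq. (97), in the computational basis.
P = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]
w = np.array([[np.trace(P[qp] @ K(P[q])).real for q in range(2)]
              for qp in range(2)])

# Each column of w sums to zero, so Eq. (98) conserves the total probability.
print(w.sum(axis=0))  # [0. 0.]
```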
In Section 3.1, we argued that our quantum conditional probabilities (15) do not generically
satisfy Bayes’ theorem due to the possible irreversibility of the linear CPTP map ER←Q on which
their definition depends. The implications for retrodiction—inference about past states given present
conditions—are nuanced. While ER←Q may not be reversible, there may be situations in which a
reverse evolution map can be defined, as explored in the context of quantum fluctuation theorems
by Aw, Buscemi, and Scarani [15], which appeared while this work was in preparation. Nonetheless,
the generic asymmetry inherent in the definition (15) typically precludes any retrodiction based on
our formulation of quantum conditional probabilities.9
In this work, we utilized quantum conditional probabilities (15) that were first developed in [5]
to define new forms of quantum conditional entropy (34) and quantum mutual information (35).
We explored how these quantities capture growth of entropy and loss of information as an open
quantum system evolves according to a linear CPTP evolution map.
Thanks to the existence of an underlying conditional probability distribution, we were able to
provide conceptually clear proofs of identities and inequalities satisfied by our quantum conditional
entropy and mutual information, analogous to those satisfied by their classical counterparts. By
contrast, the traditional von Neumann conditional entropy and mutual information generically lack
any underlying conditional probabilities, rendering their definitions and relationships conceptually
unclear.
One limitation of our approach is that our quantum conditional probabilities depend for their
definition on the existence of a well-defined linear CPTP map. For some of the results proved in this
paper, including (51), this limitation is benign because the claim itself is about a sub-class of linear
CPTP dynamics. For other proofs in this paper, like the concavity of von Neumann entropy (53), we
were able to introduce a linear CPTP map by hand without any loss of generality.
However, our derivation of the quantum data processing inequality (72) depended on the
dynamics being described by a chain of linear CPTP maps. The same is therefore true for our Holevo-
type bound in (75), with χ given by the expression on the right-hand side of (76). In general, these
sorts of inequalities do appear to depend on the dynamics being at least embeddable in some linear
CPTP map [2]. It would be interesting to explore whether our approach could be used to study more
general forms of dynamics that can be systematically approximated as analytically or numerically
controllable deviations from linear CPTP dynamics.
In light of the connections between our work and works such as [14], as described in detail in
Section 6.2, it would be interesting to explore the ways our quantum conditional entropies and our
other results, including our quantum data-processing inequality, may be applied in understanding
open-quantum system entropy growth and fluctuation theorems.
Section 5.1 explored intriguing connections between our quantum conditional probabilities and
standard quantum information-theoretic concepts that arise from the rich structure of system–
subsystem relationships in quantum theory. In future work, we will continue to explore these
connections, along with related concepts, such as quantum discord [18].
9 Watanabe raised questions about retrodiction even in situations where Bayes’ theorem is assumed to hold [16].
Others have attempted to address the generic time asymmetries in standard quantum theory by formulating a retrodictive
quantum theory. (See for example [17] and references therein.) Our formulation, by contrast, is built from standard
elements of quantum theory, and thus time asymmetries having to do with measurement processes or other open-system
dynamics are unavoidable.
Despite their failure to reproduce the law of total probability, the quantities (90) do satisfy the
Kolmogorov axioms for a basic probability distribution. They are also examples of more general
quantities of the form
Strong subadditivity can then be used to prove many of the other properties satisfied by quantum
entropies and related quantities. Furthermore, the surprising results of [20] can also be seen as a
reflection of the strong subadditivity of von Neumann entropy. Given these wide-ranging areas, we
are quite interested in exploring whether our quantum conditional probabilities and their associated
quantum conditional entropy can provide some new perspectives on strong subadditivity, and hence
shed some light on recent developments at the intersection of quantum information and quantum
gravity.
Jacob A. Barandes: Conceptualization, Formal analysis, Writing – review & editing. David Kagan:
Conceptualization, Methodology, Formal analysis, Writing – original draft, Funding acquisition.
The authors declare that they have no known competing financial interests or personal relation-
ships that could have appeared to influence the work reported in this paper.
Data availability
Acknowledgments
We thank our departmental colleagues and staff for supporting our work. D.K. thanks Darya
Krym for useful discussions. Part of this work was supported by the UMass Dartmouth Marine and
Undersea Technology Research Program (MUST) sponsored by the Office of Naval Research (ONR)
under grant N00014-22-1-2012. We would also like to thank our anonymous reviewers for their
insightful comments, which improved our paper.
then the Shannon entropy of p(y) is greater than or equal to that of p(x). To see why, consider their
difference:
H(X) − H(Y) = ∑_{x,y} p(y|x) p(x) log[p(y)/p(x)]. (106)
Applying Jensen’s inequality to the concave logarithm and using the double stochasticity of p(y|x), we find
H(X) − H(Y) ≤ log ∑_{x,y} p(y|x) p(y) = log ∑_y p(y) = log(1) = 0,
and hence H(Y) ≥ H(X), as claimed.
While we have explicitly proved this result using classical notation, the proof applies to von
Neumann entropies linked via the quantum conditional probabilities (15) defined in Section 3.1.
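The doubly stochastic case can be checked directly (an illustrative sketch, not part of the appendix): acting on a probability vector with a doubly stochastic matrix cannot decrease its Shannon entropy:

```python
import numpy as np

def H(p):
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

# A doubly stochastic matrix: a convex combination of permutation matrices
# (by Birkhoff's theorem, every doubly stochastic matrix has this form).
D = 0.6 * np.eye(3) + 0.4 * np.roll(np.eye(3), 1, axis=0)

p_x = np.array([0.7, 0.2, 0.1])
p_y = D @ p_x

print(H(p_y) >= H(p_x))  # True: doubly stochastic maps cannot decrease entropy
```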
Non-negativity
The non-negativity of quantum conditional entropy follows directly from its construction from
non-negative conditional probabilities that cannot be greater than one. Non-negativity of our form
of quantum mutual information arises by applying Jensen’s inequality to the definition (35):
I_E(R : Q) = −∑_{q,r} p(r|q) p_q log[p_r/p(r|q)] ≥ −log ∑_{q,r} p(r|q) p_q [p_r/p(r|q)] = −log(1) = 0. (110)
The difference between the quantum mutual information shared by the initial and final config-
urations, on the one hand, and the von Neumann entropy of the initial density matrix (13), on the
other hand, is
I_E(R : Q) − S(ρ̂_Q) = −∑_{q,r} p(r|q) p_q log[p_r/p(r|q)] + ∑_q p_q log p_q
= ∑_{q,r} p(r|q) p_q log[p(r|q) p_q/p_r]. (111)
Since p_r = ∑_{q′} p(r|q′) p_{q′}, we have
p_r ≥ p_E(r|q) p_q. (112)
Thus, the monotonicity of the logarithm implies that
I_E(R : Q) − S(ρ̂_Q) ≤ ∑_{q,r} p(r|q) p_q log[p_r/p_r] = 0. (113)
References