Matrix Power Mean and Karcher Mean
Abstract
We define a new family of matrix means $\{P_t(\omega; \mathbb{A})\}_{t\in[-1,1]}$, where $\omega$ and $\mathbb{A}$ vary over all
positive probability vectors in $\mathbb{R}^n$ and $n$-tuples of positive definite matrices, respectively. Each of
these means except $t = 0$ arises as the unique positive definite solution of a non-linear matrix equation,
satisfies all desirable properties of power means of positive real numbers, and interpolates between the
weighted harmonic and arithmetic means. The main result is that the Karcher mean coincides with the limit
of power means as $t \to 0$. This provides not only a sequence of matrix means converging to the Karcher
mean, but also a simple proof of the monotonicity of the Karcher mean, conjectured by Bhatia and Holbrook,
and other new properties, which have recently been established by Lawson and Lim and also Bhatia and
Karandikar using probabilistic methods on the metric structure of positive definite matrices equipped
with the trace metric.
© 2011 Elsevier Inc. All rights reserved.
Y. Lim, M. Pálfia / Journal of Functional Analysis 262 (2012) 1498–1514
doi:10.1016/j.jfa.2011.11.012
Keywords: Positive definite matrix; Geometric mean; Monotonicity; Riemannian trace metric; Metric nonpositive
curvature; Thompson metric; Power mean; Riemannian barycenter
1. Introduction
The Riemannian trace metric on the convex cone $\mathbb{P} = \mathbb{P}_m$ of $m \times m$ positive definite Hermitian
matrices plays an important role in many applied areas involving matrix interpolation, filtering,
estimation, optimization and averaging, where it has been increasingly recognized that the Euclidean
distance is often not the most suitable for the set $\mathbb{P}$ and that working with the appropriate
geometry does matter in computational problems. (Recall that the trace metric distance between two
positive definite matrices is given by $\delta(A, B) = \bigl(\sum_{i=1}^{m} \log^2 \lambda_i(A^{-1}B)\bigr)^{1/2}$, where $\lambda_i(X)$ denotes
the $i$-th eigenvalue of $X$ in ascending order.) It turns out that the Riemannian geometry plays a
key role particularly in the study of inversion invariant data averaging procedures in image
processing, in radar detection and in brain-computer interfacing [5,4,29]. An attractive candidate for
such data averaging procedures is the least squares mean [24] of positive definite matrices. This mean
has appeared under a variety of other designations: Fréchet mean, Cartan mean, Riemannian
center of mass [18], Riemannian geometric mean [29], or frequently, Karcher mean [14], the
terminology we adopt. The Karcher mean of $n$ positive definite matrices $A_1, \dots, A_n$ is defined
as the unique minimizer (provided it exists) of the sum of squares of the Riemannian trace metric
distances to each of the $A_i$, i.e.,
\[ \Lambda(A_1, \dots, A_n) = \underset{X \in \mathbb{P}}{\arg\min} \sum_{i=1}^{n} \delta^2(X, A_i). \tag{1.1} \]
This idea had been anticipated by Élie Cartan (see, for example, Section 6.1.5 of [6]), who
showed among other things such a unique minimizer exists if the points all lie in a convex ball in
a Riemannian manifold; see also Karcher’s paper [18]. Using Karcher’s formula for the gradient
of the objective function (Theorem 2.1 of [18]) or computing appropriate derivatives as in [10,28]
yields that the Karcher mean coincides with the unique positive definite solution of the Karcher
equation
\[ \sum_{i=1}^{n} \log\bigl(X^{1/2} A_i^{-1} X^{1/2}\bigr) = 0. \tag{1.2} \]
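As an illustration of the trace metric distance and of the Karcher equation (1.2), the following minimal sketch evaluates both numerically. It assumes real symmetric positive definite inputs and NumPy; the helper names `spd_fun`, `trace_metric` and `karcher_residual` are hypothetical and not taken from the paper. For commuting (here diagonal) matrices the entrywise geometric mean should make the left-hand side of (1.2) vanish.

```python
import numpy as np

def spd_fun(A, f):
    # Apply a scalar function to a symmetric matrix through its
    # eigendecomposition A = V diag(w) V^T.
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.T

def trace_metric(A, B):
    # delta(A, B) = sqrt(sum_i log^2 lambda_i(A^{-1} B)); the eigenvalues of
    # A^{-1} B coincide with those of the symmetric matrix A^{-1/2} B A^{-1/2}.
    Aih = spd_fun(A, lambda w: w ** -0.5)
    lam = np.linalg.eigvalsh(Aih @ B @ Aih)
    return float(np.sqrt(np.sum(np.log(lam) ** 2)))

def karcher_residual(X, mats):
    # Left-hand side of (1.2): sum_i log(X^{1/2} A_i^{-1} X^{1/2}).
    Xh = spd_fun(X, np.sqrt)
    return sum(spd_fun(Xh @ np.linalg.inv(A) @ Xh, np.log) for A in mats)

# Commuting (diagonal) example: the entrywise geometric mean solves (1.2).
mats = [np.diag([1.0, 4.0]), np.diag([4.0, 9.0]), np.diag([2.0, 6.0])]
X = np.diag(np.prod([np.diag(A) for A in mats], axis=0) ** (1.0 / len(mats)))
res = karcher_residual(X, mats)
dist = trace_metric(mats[0], mats[1])
```

For non-commuting inputs no such closed form is available in general, which is why the text turns to iterative and limiting constructions.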
Various numerical methods for the solution of (1.1) or (1.2) have been introduced in the liter-
ature: fixed point methods, optimization algorithms like Newton’s method or a gradient descent
method, and iterative methods; see [14] and references therein. Unfortunately neither an ex-
plicit expression nor an explicit sequence of matrix means converging directly to the Karcher
mean is known. Nevertheless the monotonicity of the Karcher mean, conjectured by Bhatia and
Holbrook [11] and one of the key axiomatic properties of matrix geometric means, was recently
established by Lawson and Lim [24] via a probabilistic convergence of approximations and by
Bhatia and Karandikar [12] via some probabilistic counting arguments, both arguments depend-
ing heavily on basic inequalities for the Riemannian metric. In this paper we provide a more
direct, non-probabilistic proof of the monotonicity of the Karcher mean that depends on finding
a sequence of matrix means satisfying monotonicity that converge directly to the Karcher mean.
The principal goal of this paper is to construct a particular family of matrix means, each with nu-
merous desirable properties such as monotonicity, that converges to the Karcher mean and show
that these properties are preserved in the limit.
The basic family of means we consider are the power means. The power mean $\bigl(\frac{a_1^t + \cdots + a_n^t}{n}\bigr)^{1/t}$
of $n$ positive real numbers $a_1, \dots, a_n$ arises as the unique positive solution of the elementary
equation $x = \frac{1}{n}\sum_{i=1}^{n} x^{1-t} a_i^t$, and converges to the geometric mean of $a_1, \dots, a_n$ as $t \to 0$. In
this paper we consider a matrix analogue of $x = \frac{1}{n}\sum_{i=1}^{n} x^{1-t} a_i^t$, namely
\[ X = \frac{1}{n} \sum_{i=1}^{n} X \#_t A_i, \tag{1.3} \]
where $A \#_t B = A^{1/2}(A^{-1/2} B A^{-1/2})^t A^{1/2}$ is the $t$-weighted geometric mean of $A$ and $B$. We
prove that for each $t \in (0, 1]$, Eq. (1.3) has a unique positive definite solution, denoted by
$P_t(A_1, \dots, A_n)$, and show that each of these matrix means (called a power mean) arises as
the unique fixed point of a strict contraction for the Thompson metric. We show these power
means vary continuously with $t$ and satisfy analogues of basic properties of power means
of positive real numbers (e.g., monotonicity and joint concavity). We then establish that the
Karcher mean is the limit of power means as $t \to 0$. This gives, in particular, a simple and
non-probabilistic proof of monotonicity, joint concavity and other new properties of the Karcher
mean recently established by Bhatia and Karandikar [12], and a globally convergent method for
obtaining the Karcher mean by taking the limit of $X_k = P_{1/k}(A_1, \dots, A_n)$. Moreover, together
with $P_{-t}(A_1, \dots, A_n) := P_t(A_1^{-1}, \dots, A_n^{-1})^{-1}$, this provides a complete extension of the power
means of positive reals to positive definite matrices in the sense that the family of matrix means
$\{P_t(A_1, \dots, A_n)\}_{t\in[-1,1]}$ interpolates continuously between the harmonic ($t = -1$) and
arithmetic ($t = 1$) means with the Karcher mean appearing at $t = 0$.
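For the scalar case, the two claims made above (the closed form of the solution and its limit as $t \to 0$) can be checked by a short computation. The following worked derivation is a sketch for positive reals only, with $t \in (0, 1]$, and uses nothing beyond a first-order expansion of the exponential.

```latex
% Solving the scalar equation: divide both sides by x^{1-t} > 0.
\[
x = \frac{1}{n}\sum_{i=1}^{n} x^{1-t} a_i^{t}
\iff x^{t} = \frac{1}{n}\sum_{i=1}^{n} a_i^{t}
\iff x = \Bigl(\frac{a_1^{t} + \cdots + a_n^{t}}{n}\Bigr)^{1/t}.
\]
% Limit t -> 0: write a_i^t = e^{t \log a_i} = 1 + t \log a_i + O(t^2).
\[
\log x = \frac{1}{t}\log\Bigl(1 + \frac{t}{n}\sum_{i=1}^{n}\log a_i + O(t^{2})\Bigr)
\xrightarrow[t \to 0]{} \frac{1}{n}\sum_{i=1}^{n}\log a_i ,
\qquad\text{so}\qquad
x \to (a_1 \cdots a_n)^{1/n}.
\]
```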
Let $\mathbb{H}$ be the space of Hermitian matrices of a fixed size $m$, and $\mathbb{P}$ the convex cone of positive
definite Hermitian matrices. For $X, Y \in \mathbb{H}$, we write $X \le Y$ if $Y - X$ is positive semidefinite,
and $X < Y$ if $Y - X$ is positive definite. The Frobenius norm $\|\cdot\|_2$ gives rise to the Riemannian
structure on $\mathbb{P}$: $\langle X, Y \rangle_A = \operatorname{Tr}(A^{-1} X A^{-1} Y)$, where $A \in \mathbb{P}$, $X, Y \in T_A(\mathbb{P}) = \mathbb{H}$. The
Riemannian metric distance is given by $\delta(A, B) = \bigl[\sum_{i=1}^{m} \log^2 \lambda_i(A^{-1} B)\bigr]^{1/2}$, where the $\lambda_i(X)$ denote
the eigenvalues of $X$.
(i) $A \#_t B = A^{1-t} B^t$ for $AB = BA$, and $(aA) \#_t (bB) = a^{1-t} b^t (A \#_t B)$ for $a, b > 0$;
(ii) (Löwner–Heinz inequality) $A \#_t B \le C \#_t D$ for $A \le C$, $B \le D$ and $t \in [0, 1]$;
(iii) $M(A \#_t B)M^* = (MAM^*) \#_t (MBM^*)$ for any non-singular $M$;
(iv) $A \#_t B = B \#_{1-t} A$, $(A \#_t B)^{-1} = A^{-1} \#_t B^{-1}$;
(v) $(\lambda A + (1 - \lambda)B) \#_t (\lambda C + (1 - \lambda)D) \ge \lambda(A \#_t C) + (1 - \lambda)(B \#_t D)$ for $\lambda, t \in [0, 1]$;
(vi) $\det(A \#_t B) = \det(A)^{1-t} \det(B)^t$; and
(vii) $((1 - t)A^{-1} + tB^{-1})^{-1} \le A \#_t B \le (1 - t)A + tB$ for $t \in [0, 1]$.
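The two-variable weighted geometric mean and a few of the listed properties can be checked numerically. The sketch below assumes real symmetric positive definite matrices and NumPy; the helpers `spd_fun`, `geo` and `min_eig` are hypothetical names introduced only for this illustration. It tests (iv), (vi) and the harmonic-geometric-arithmetic inequality (vii) on one fixed pair of matrices, which is a sanity check and not a proof.

```python
import numpy as np

def spd_fun(A, f):
    # Scalar function of a symmetric matrix via eigendecomposition.
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.T

def geo(A, B, t):
    # A #_t B = A^{1/2} (A^{-1/2} B A^{-1/2})^t A^{1/2}
    Ah = spd_fun(A, np.sqrt)
    Aih = spd_fun(A, lambda w: w ** -0.5)
    M = Aih @ B @ Aih
    M = (M + M.T) / 2  # remove rounding asymmetry
    return Ah @ spd_fun(M, lambda w: w ** t) @ Ah

def min_eig(S):
    # Smallest eigenvalue of the symmetric part of S.
    return float(np.linalg.eigvalsh((S + S.T) / 2).min())

A = np.array([[2.0, 1.0], [1.0, 3.0]])
B = np.array([[5.0, -1.0], [-1.0, 1.0]])
t = 0.3
G = geo(A, B, t)

# (iv): A #_t B = B #_{1-t} A
sym_gap = np.linalg.norm(G - geo(B, A, 1 - t))
# (vi): det(A #_t B) = det(A)^{1-t} det(B)^t
det_gap = abs(np.linalg.det(G) - np.linalg.det(A) ** (1 - t) * np.linalg.det(B) ** t)
# (vii): harmonic <= geometric <= arithmetic in the Loewner order
harm = np.linalg.inv((1 - t) * np.linalg.inv(A) + t * np.linalg.inv(B))
arith = (1 - t) * A + t * B
lower_gap = min_eig(G - harm)
upper_gap = min_eig(arith - G)
```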
The Thompson metric on $\mathbb{P}$ is defined by $d_\infty(A, B) = \|\log(A^{-1/2} B A^{-1/2})\|_\infty$, where $\|X\|_\infty$
denotes the spectral norm of $X$. It is known that $d_\infty$ is a complete metric on $\mathbb{P}$ and $d_\infty(A, B) =
\max\{\log M(B/A), \log M(A/B)\}$, where $M(B/A) = \inf\{\alpha > 0 : B \le \alpha A\} = \lambda_{\max}(A^{-1/2} B A^{-1/2})$,
the largest eigenvalue of $A^{-1/2} B A^{-1/2}$. See [32,16].
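The two descriptions of the Thompson metric given above can be compared numerically. This is a minimal sketch assuming real symmetric positive definite matrices and NumPy; `thompson` and `thompson_via_M` are hypothetical helper names. It uses the fact that the eigenvalues of $B^{-1/2} A B^{-1/2}$ are the reciprocals of those of $A^{-1/2} B A^{-1/2}$.

```python
import numpy as np

def spd_fun(A, f):
    # Scalar function of a symmetric matrix via eigendecomposition.
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.T

def relative_eigs(A, B):
    # Eigenvalues of A^{-1/2} B A^{-1/2}, in ascending order.
    Aih = spd_fun(A, lambda w: w ** -0.5)
    return np.linalg.eigvalsh(Aih @ B @ Aih)

def thompson(A, B):
    # Spectral norm of log(A^{-1/2} B A^{-1/2}).
    return float(np.max(np.abs(np.log(relative_eigs(A, B)))))

def thompson_via_M(A, B):
    # max{log M(B/A), log M(A/B)}, with M(B/A) the largest relative
    # eigenvalue and M(A/B) the reciprocal of the smallest one.
    lam = relative_eigs(A, B)
    return float(max(np.log(lam[-1]), np.log(1.0 / lam[0])))

A = np.array([[2.0, 1.0], [1.0, 2.0]])
B = np.array([[3.0, 0.0], [0.0, 1.0]])
d1 = thompson(A, B)
d2 = thompson_via_M(A, B)
d_inv = thompson(np.linalg.inv(A), np.linalg.inv(B))  # inversion invariance
```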
The following non-expansive property of addition for the Thompson metric will be useful for
our purpose.
\[ X = \sum_{i=1}^{n} w_i (X \#_{t_i} A_i). \tag{3.4} \]
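Definition 3.2 (Matrix power means). Let $\mathbb{A} = (A_1, \dots, A_n) \in \mathbb{P}^n$ and $\omega = (w_1, \dots, w_n) \in \Delta_n$, the set of
positive probability vectors in $\mathbb{R}^n$. For $t \in (0, 1]$, we denote by $P_t(\omega; \mathbb{A})$ the unique solution of
\[ X = \sum_{i=1}^{n} w_i (X \#_t A_i). \tag{3.5} \]
For $t \in [-1, 0)$, we define $P_t(\omega; \mathbb{A}) = P_{-t}(\omega; \mathbb{A}^{-1})^{-1}$, where $\mathbb{A}^{-1} = (A_1^{-1}, \dots, A_n^{-1})$. We call
$P_t(\omega; \mathbb{A})$ the $\omega$-weighted power mean of order $t$ of $A_1, \dots, A_n$. To simplify the notation we write
$P_t(\mathbb{A}) = P_t(1/n, \dots, 1/n; \mathbb{A})$.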
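Remark 3.3. We note that $P_1(\omega; \mathbb{A}) = \sum_{i=1}^{n} w_i A_i$ and $P_{-1}(\omega; \mathbb{A}) = \bigl(\sum_{i=1}^{n} w_i A_i^{-1}\bigr)^{-1}$, the
$\omega$-weighted arithmetic and harmonic means of $A_1, \dots, A_n$, respectively. For $t \in [-1, 0)$, $P_t(\omega; \mathbb{A})$
is the unique positive definite solution of
\[ X = \Bigl[\sum_{i=1}^{n} w_i (X \#_{-t} A_i)^{-1}\Bigr]^{-1}. \tag{3.6} \]
Indeed, $X^{-1} = \sum_{i=1}^{n} w_i (X^{-1} \#_{-t} A_i^{-1})$ if and only if $X^{-1} = P_{-t}(\omega; \mathbb{A}^{-1})$.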
Remark 3.4. Let $f : \mathbb{P} \to \mathbb{P}$ be defined by $f(X) = \sum_{i=1}^{n} w_i (X \#_t A_i)$, $t \in (0, 1]$. Then by the
Löwner–Heinz inequality, $f$ is monotone: $X \le Y$ implies that $f(X) \le f(Y)$. By Theorem 3.1,
$f$ is a strict contraction for the Thompson metric with least contraction coefficient less than
or equal to $1 - t$.
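Because $f$ is a strict contraction, iterating it from any positive definite starting point converges to $P_t(\omega; \mathbb{A})$. The following is a minimal sketch of that fixed-point iteration, assuming real symmetric positive definite inputs and NumPy; the function names, the starting point (the arithmetic mean), the stopping rule and the tolerance are illustrative choices, not prescriptions from the paper. Negative orders are handled through the duality of Definition 3.2.

```python
import numpy as np

def spd_fun(A, f):
    # Scalar function of a symmetric matrix via eigendecomposition.
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.T

def geo(A, B, t):
    # A #_t B = A^{1/2} (A^{-1/2} B A^{-1/2})^t A^{1/2}
    Ah = spd_fun(A, np.sqrt)
    Aih = spd_fun(A, lambda w: w ** -0.5)
    M = Aih @ B @ Aih
    return Ah @ spd_fun((M + M.T) / 2, lambda w: w ** t) @ Ah

def power_mean(weights, mats, t, tol=1e-12, max_iter=10000):
    # Fixed-point iteration X <- sum_i w_i (X #_t A_i) for t in (0, 1];
    # for t in [-1, 0) use P_t(w; A) = P_{-t}(w; A^{-1})^{-1}.
    if t < 0:
        inverses = [np.linalg.inv(A) for A in mats]
        return np.linalg.inv(power_mean(weights, inverses, -t, tol, max_iter))
    X = sum(w * A for w, A in zip(weights, mats))  # start at the arithmetic mean
    for _ in range(max_iter):
        X_new = sum(w * geo(X, A, t) for w, A in zip(weights, mats))
        X_new = (X_new + X_new.T) / 2
        if np.linalg.norm(X_new - X) <= tol * np.linalg.norm(X):
            return X_new
        X = X_new
    return X

weights = [0.2, 0.3, 0.5]
mats = [np.array([[2.0, 1.0], [1.0, 3.0]]),
        np.array([[5.0, -1.0], [-1.0, 1.0]]),
        np.diag([1.0, 4.0])]
X = power_mean(weights, mats, 0.5)
residual = np.linalg.norm(X - sum(w * geo(X, A, 0.5) for w, A in zip(weights, mats)))
```

Since the contraction coefficient is only bounded by $1 - t$, the number of iterations can be expected to grow roughly like $1/t$ as $t$ becomes small.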
(13) For s ∈ (0, 1], Pt (ω; X #s A1 , . . . , X #s An ) = X if and only if X = Pst (ω; A);
(14) If t ∈ (0, 1], then Φ(Pt (ω; A)) Pt (ω; Φ(A)) for any positive unital linear map Φ, where
Φ(A) = (Φ(A1 ), . . . , Φ(An )). If t ∈ [−1, 0), then Φ(Pt (ω; A)) Pt (ω; Φ(A)) for any
strictly positive unital linear map Φ; and
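(15) For any unitarily invariant norm $|||\cdot|||$ and $t \in (0, 1]$,
\[ |||P_t(\omega; \mathbb{A})||| \le \Bigl(\sum_{i=1}^{n} w_i |||A_i|||^t\Bigr)^{1/t} \quad\text{and}\quad |||P_{-t}(\omega; \mathbb{A})||| \ge \Bigl(\sum_{i=1}^{n} w_i |||A_i^{-1}|||^t\Bigr)^{-1/t}. \]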
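Proof. (1) Suppose that the $A_i$'s commute. Let $t \in (0, 1]$ and $X = \bigl(\sum_{i=1}^{n} w_i A_i^t\bigr)^{1/t}$. Then
$X \#_t A_i = X^{1-t} A_i^t$ and $\sum_{i=1}^{n} w_i (X \#_t A_i) = \sum_{i=1}^{n} w_i X^{1-t} A_i^t = X^{1-t} \sum_{i=1}^{n} w_i A_i^t = X^{1-t} X^t = X$.
By uniqueness, $\bigl(\sum_{i=1}^{n} w_i A_i^t\bigr)^{1/t} = X = P_t(\omega; \mathbb{A})$. Furthermore, $P_{-t}(\omega; \mathbb{A}) = P_t(\omega; \mathbb{A}^{-1})^{-1} =
\bigl(\sum_{i=1}^{n} w_i A_i^{-t}\bigr)^{-1/t}$.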
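(2) Let $t \in (0, 1]$. Set $\beta = \bigl(\sum_{i=1}^{n} w_i a_i^t\bigr)^{1/t}$, let $\zeta \in \Delta_n$ denote the normalization of
$(w_1 a_1^t, \dots, w_n a_n^t)$, that is, $\zeta_i = \frac{w_i a_i^t}{\sum_{j=1}^{n} w_j a_j^t}$, and let $X = P_t(\zeta; \mathbb{A})$. Then
$X = \sum_{i=1}^{n} \zeta_i (X \#_t A_i)$. Therefore,
\[ \sum_{i=1}^{n} w_i (\beta X) \#_t (a_i A_i) = \beta^{1-t} \sum_{i=1}^{n} w_i a_i^t (X \#_t A_i) = \beta^{1-t} \sum_{i=1}^{n} \zeta_i \beta^t (X \#_t A_i) = \beta \sum_{i=1}^{n} \zeta_i (X \#_t A_i) = \beta X. \]
By uniqueness, $\bigl(\sum_{i=1}^{n} w_i a_i^t\bigr)^{1/t} P_t(\zeta; \mathbb{A}) = \beta X = P_t(\omega; a \cdot \mathbb{A})$.
For $t \in [-1, 0)$, with $\zeta$ given by the same formula, we have
\[ P_t(\omega; a \cdot \mathbb{A}) = P_{-t}\bigl(\omega; (a \cdot \mathbb{A})^{-1}\bigr)^{-1} = \Bigl[\Bigl(\sum_{i=1}^{n} w_i a_i^t\Bigr)^{-1/t} P_{-t}(\zeta; \mathbb{A}^{-1})\Bigr]^{-1} \]
\[ = \Bigl(\sum_{i=1}^{n} w_i a_i^t\Bigr)^{1/t} P_{-t}(\zeta; \mathbb{A}^{-1})^{-1} = \Bigl(\sum_{i=1}^{n} w_i a_i^t\Bigr)^{1/t} P_t(\zeta; \mathbb{A}). \]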
\[ f(X) = \sum_{i=1}^{n} w_i (X \#_t A_i) \quad\text{and}\quad g(X) = \sum_{i=1}^{n} w_i (X \#_t B_i). \]
Then $P_t(\omega; \mathbb{A}) = \lim_{k\to\infty} f^k(X)$ and $P_t(\omega; \mathbb{B}) = \lim_{k\to\infty} g^k(X)$ for any $X \in \mathbb{P}$, by the
Banach fixed point theorem. By the Löwner–Heinz inequality, $f(X) \le g(X)$ for all $X \in \mathbb{P}$,
and $f(X) \le f(Y)$, $g(X) \le g(Y)$ whenever $X \le Y$. Let $X_0 > 0$. Then $f(X_0) \le g(X_0)$ and
$f^2(X_0) = f(f(X_0)) \le g(f(X_0)) \le g^2(X_0)$. Inductively, we have $f^k(X_0) \le g^k(X_0)$ for all
$k \in \mathbb{N}$. Therefore, $P_t(\omega; \mathbb{A}) = \lim_{k\to\infty} f^k(X_0) \le \lim_{k\to\infty} g^k(X_0) = P_t(\omega; \mathbb{B})$.
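Let $t \in [-1, 0)$. Then $\mathbb{A}^{-1} \ge \mathbb{B}^{-1}$ and thus $P_{-t}(\omega; \mathbb{A}^{-1}) \ge P_{-t}(\omega; \mathbb{B}^{-1})$. Therefore,
$P_t(\omega; \mathbb{A}) = P_{-t}(\omega; \mathbb{A}^{-1})^{-1} \le P_{-t}(\omega; \mathbb{B}^{-1})^{-1} = P_t(\omega; \mathbb{B})$.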
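(5) Let $t \in (0, 1]$. Let $X = P_t(\omega; \mathbb{A})$ and $Y = P_t(\omega; \mathbb{B})$. Then by Lemma 2.2 and Lemma 2.3,
\[ d_\infty(X, Y) = d_\infty\Bigl(\sum_{i=1}^{n} w_i (X \#_t A_i), \sum_{i=1}^{n} w_i (Y \#_t B_i)\Bigr) \]
\[ \le \max_{1 \le i \le n} d_\infty(X \#_t A_i, Y \#_t B_i) \le \max_{1 \le i \le n} \bigl\{(1 - t) d_\infty(X, Y) + t\, d_\infty(A_i, B_i)\bigr\} \]
\[ = (1 - t) d_\infty(X, Y) + t \max_{1 \le i \le n} \bigl\{d_\infty(A_i, B_i)\bigr\}, \]
which implies that $d_\infty(X, Y) \le \max_{1 \le i \le n}\{d_\infty(A_i, B_i)\}$. Since $d_\infty$ is invariant under inversion,
we also have
\[ d_\infty\bigl(P_{-t}(\omega; \mathbb{A}), P_{-t}(\omega; \mathbb{B})\bigr) = d_\infty\bigl(P_t(\omega; \mathbb{A}^{-1})^{-1}, P_t(\omega; \mathbb{B}^{-1})^{-1}\bigr) = d_\infty\bigl(P_t(\omega; \mathbb{A}^{-1}), P_t(\omega; \mathbb{B}^{-1})\bigr) \]
\[ \le \max_{1 \le i \le n} \bigl\{d_\infty(A_i^{-1}, B_i^{-1})\bigr\} = \max_{1 \le i \le n} \bigl\{d_\infty(A_i, B_i)\bigr\}. \]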
(6) Let $t \in (0, 1]$. Let $X = P_t(\omega; \mathbb{A})$ and $Y = P_t(\omega; \mathbb{B})$. For $u \in [0, 1]$, we set $Z_u =
(1 - u)X + uY$. Let $f(Z) = \sum_{i=1}^{n} w_i (Z \#_t ((1 - u)A_i + uB_i))$. Then by the joint concavity
of the two-variable geometric mean
\[ Z_u = (1 - u)X + uY = \sum_{i=1}^{n} w_i \bigl[(1 - u)(X \#_t A_i) + u(Y \#_t B_i)\bigr] \]
\[ \le \sum_{i=1}^{n} w_i \bigl[\bigl((1 - u)X + uY\bigr) \#_t \bigl((1 - u)A_i + uB_i\bigr)\bigr] = f(Z_u). \]
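\[ \operatorname{Det} P_{-t}(\omega; \mathbb{A}) = \operatorname{Det}\bigl(P_t(\omega; \mathbb{A}^{-1})^{-1}\bigr) = \bigl[\operatorname{Det} P_t(\omega; \mathbb{A}^{-1})\bigr]^{-1} \le \prod_{i=1}^{n} \operatorname{Det}(A_i^{-1})^{-w_i} = \prod_{i=1}^{n} \operatorname{Det}(A_i)^{w_i}. \]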
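(10) Let $t \in (0, 1]$. Let $X = P_t(\omega; \mathbb{A})$. By using the two-variable weighted arithmetic-geometric
mean inequality, we obtain
\[ X = \sum_{i=1}^{n} w_i (X \#_t A_i) \le \sum_{i=1}^{n} w_i \bigl[(1 - t)X + tA_i\bigr] = (1 - t)X + t \sum_{i=1}^{n} w_i A_i, \]
which implies that $X \le \sum_{i=1}^{n} w_i A_i$. Similarly,
\[ X^{-1} = \Bigl[\sum_{i=1}^{n} w_i (X \#_t A_i)\Bigr]^{-1} \le \sum_{i=1}^{n} w_i (X \#_t A_i)^{-1} = \sum_{i=1}^{n} w_i \bigl(X^{-1} \#_t A_i^{-1}\bigr). \]
Hence
\[ X^{-1} \le \sum_{i=1}^{n} w_i \bigl(X^{-1} \#_t A_i^{-1}\bigr) \le \sum_{i=1}^{n} w_i \bigl[(1 - t)X^{-1} + tA_i^{-1}\bigr] = (1 - t)X^{-1} + t \sum_{i=1}^{n} w_i A_i^{-1}, \]
which implies that $X \ge \bigl(\sum_{i=1}^{n} w_i A_i^{-1}\bigr)^{-1}$.
The case $t \in [-1, 0)$ holds by duality.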
(11) Let $t \in (0, 1]$ and let $X = P_t(\omega; \mathbb{A})$. Then
\[ X = \sum_{i=1}^{n} w_i (X \#_t A_i) = \frac{1}{k} \underbrace{\Bigl[\sum_{i=1}^{n} w_i (X \#_t A_i) + \cdots + \sum_{i=1}^{n} w_i (X \#_t A_i)\Bigr]}_{k} \]
\[ \Phi(X_t) = \sum_{i=1}^{n} w_i \Phi(X_t \#_t A_i) \le \sum_{i=1}^{n} w_i \bigl(\Phi(X_t) \#_t \Phi(A_i)\bigr). \tag{3.7} \]
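Define $f(X) = \sum_{i=1}^{n} w_i (X \#_t \Phi(A_i))$. Then $\lim_{k\to\infty} f^k(X) = P_t(\omega; \Phi(\mathbb{A}))$ for any $X > 0$.
By (3.7), $f(\Phi(X_t)) \ge \Phi(X_t)$. Since $f$ is monotonic, $f^k(\Phi(X_t)) \ge \Phi(X_t)$ for all $k \in \mathbb{N}$. Thus,
$P_t(\omega; \Phi(\mathbb{A})) = \lim_{k\to\infty} f^k(\Phi(X_t)) \ge \Phi(X_t) = \Phi(P_t(\omega; \mathbb{A}))$.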
Let $t \in [-1, 0)$ and let $\Phi$ be a strictly positive unital linear map. By Choi's inequality
(Theorem 2.3.6 of [10]), $\Phi(A)^{-1} \le \Phi(A^{-1})$ for all $A > 0$. By (4) and the preceding paragraph,
Φ(P−t (ω; A−1 )) P−t (ω; Φ(A−1 )) P−t (ω; Φ(A)−1 ). This implies that Φ(Pt (ω; A)) =
Φ(P−t (ω; A−1 )−1 ) Φ(P−t (ω; A−1 ))−1 P−t (ω; Φ(A)−1 )−1 = Pt (ω; Φ(A)).
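(15) Let $t \in (0, 1]$ and $X = P_t(\omega; \mathbb{A})$. Then
\[ |||X||| \le \sum_{i=1}^{n} w_i |||X \#_t A_i||| \le \sum_{i=1}^{n} w_i |||X|||^{1-t} |||A_i|||^t = |||X|||^{1-t} \sum_{i=1}^{n} w_i |||A_i|||^t, \]
where the second inequality follows from Theorem 2.10 of [27] and Corollary IX.5.3 [8], and
hence $|||P_t(\omega; \mathbb{A})||| \le \bigl(\sum_{i=1}^{n} w_i |||A_i|||^t\bigr)^{1/t}$. Since $|||A^{-1}||| \ge |||A|||^{-1}$ for any $A > 0$, $|||P_{-t}(\omega; \mathbb{A})||| =
|||P_t(\omega; \mathbb{A}^{-1})^{-1}||| \ge |||P_t(\omega; \mathbb{A}^{-1})|||^{-1} \ge \bigl[\sum_{i=1}^{n} w_i |||A_i^{-1}|||^t\bigr]^{-1/t}$. □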
Remark 3.6. From the (AGH) inequalities (Proposition 3.5(10)) we can obtain other inequalities
from operator monotone functions on the positive reals. Let $f : (0, \infty) \to (0, \infty)$ be an operator
monotone increasing (resp. decreasing) function. Then
\[ P_t\bigl(\omega; f(\mathbb{A})\bigr) \le \sum_{i=1}^{n} w_i f(A_i) \le f\Bigl(\sum_{i=1}^{n} w_i A_i\Bigr), \]
\[ P_t\bigl(\omega; f(\mathbb{A})\bigr) \ge \Bigl[\sum_{i=1}^{n} w_i f(A_i)^{-1}\Bigr]^{-1} \ge f\Bigl(\sum_{i=1}^{n} w_i A_i\Bigr), \]
respectively; these follow from the equivalence between operator monotonicity and operator
log-concavity by Ando and Hiai [1].
Property (12) implies in particular that Pt (ω̂; A1 , . . . , An−1 ) is the unique fixed point of the
map f (X) = Pt (ω; A1 , . . . , An−1 , X). By Proposition 3.5(5), f is a non-expansive map for the
Thompson metric.
Corollary 3.7. Let $t \in [-1, 1] \setminus \{0\}$, $\omega \in \Delta_n$ and let $A_1, \dots, A_{n-1} \in \mathbb{P}$. Then there exists
$X_0 \in \mathbb{P}$ such that $\lim_{k\to\infty} f^k(X_0) = P_t(\hat{\omega}; A_1, \dots, A_{n-1})$, where $f : \mathbb{P} \to \mathbb{P}$ is defined by
$f(X) = P_t(\omega; A_1, \dots, A_{n-1}, X)$. Furthermore, for $B \in \mathbb{P}$,
Proof. Let $f(X) = P_t(\omega; A_1, \dots, A_{n-1}, X)$. By Proposition 3.5(4), $f$ is monotonic. Let $B \in \mathbb{P}$.
Pick $\alpha, \beta > 0$ such that $B, A_i \in [\beta I, \alpha I] = \{X \in \mathbb{P} : \beta I \le X \le \alpha I\}$ for all $i = 1, \dots, n - 1$. Then
by Proposition 3.5(10), $f$ maps $[\beta I, \alpha I]$ into itself. Indeed,
for any $X \in [\beta I, \alpha I]$. So, $f^k(X_0) \in [\beta I, \alpha I]$ for all $k \in \mathbb{N}$ and for any $X_0 \in [\beta I, \alpha I]$. Let
$X_0 \in [\beta I, \alpha I]$ such that $f(X_0) \le X_0$. Then by induction $f^{k+1}(X_0) \le f^k(X_0)$ for all $k \in \mathbb{N}$.
That is, $\{f^k(X_0)\}_{k=1}^{\infty}$ is a decreasing sequence bounded below by $\beta I$ and thus converges to
$P_t(\hat{\omega}; A_1, \dots, A_{n-1})$, which is the unique fixed point of $f$. In particular for $X_0 = \alpha I$, we have
that $f(\alpha I) \le \alpha I$ and $\lim_{k\to\infty} f^k(\alpha I) = P_t(\hat{\omega}; A_1, \dots, A_{n-1})$.
Suppose that $f(B) \le B$. Then $f^{k+1}(B) \le f^k(B) \le B$ for all $k \in \mathbb{N}$ and hence
$P_t(\hat{\omega}; A_1, \dots, A_{n-1}) = \lim_{k\to\infty} f^k(B) \le f(B) \le B$. □
\[ P_t(w_1, w_2; A, B) = A \#_{1/t} \bigl[w_1 A + w_2 (A \#_t B)\bigr] = A^{1/2} \bigl[w_1 I + w_2 (A^{-1/2} B A^{-1/2})^t\bigr]^{1/t} A^{1/2}. \]
In particular, $P_{1/n}(w_1, w_2; A, B) = \sum_{k=0}^{n} \binom{n}{k} w_1^k w_2^{n-k} (B \#_{k/n} A)$.
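The closed form for two matrices can be checked against the defining equation $X = w_1 (X \#_t A) + w_2 (X \#_t B)$ and against the binomial expansion for $t = 1/n$. The sketch below assumes real symmetric positive definite matrices and NumPy; `power_mean_2` and `binomial_form` are hypothetical helper names introduced for this illustration only.

```python
import math
import numpy as np

def spd_fun(A, f):
    # Scalar function of a symmetric matrix via eigendecomposition.
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.T

def geo(A, B, t):
    # A #_t B = A^{1/2} (A^{-1/2} B A^{-1/2})^t A^{1/2}
    Ah = spd_fun(A, np.sqrt)
    Aih = spd_fun(A, lambda w: w ** -0.5)
    M = Aih @ B @ Aih
    return Ah @ spd_fun((M + M.T) / 2, lambda w: w ** t) @ Ah

def power_mean_2(w1, w2, A, B, t):
    # A^{1/2} [w1 I + w2 Z^t]^{1/t} A^{1/2} with Z = A^{-1/2} B A^{-1/2}
    Ah = spd_fun(A, np.sqrt)
    Aih = spd_fun(A, lambda w: w ** -0.5)
    Z = Aih @ B @ Aih
    U = spd_fun((Z + Z.T) / 2, lambda z: (w1 + w2 * z ** t) ** (1.0 / t))
    return Ah @ U @ Ah

def binomial_form(w1, w2, A, B, n):
    # sum_{k=0}^{n} C(n, k) w1^k w2^{n-k} (B #_{k/n} A)
    return sum(math.comb(n, k) * w1 ** k * w2 ** (n - k) * geo(B, A, k / n)
               for k in range(n + 1))

A = np.array([[2.0, 1.0], [1.0, 3.0]])
B = np.array([[5.0, -1.0], [-1.0, 1.0]])
w1, w2, t = 0.3, 0.7, 0.5
X = power_mean_2(w1, w2, A, B, t)
fixed_point_gap = np.linalg.norm(X - (w1 * geo(X, A, t) + w2 * geo(X, B, t)))
binomial_gap = np.linalg.norm(power_mean_2(w1, w2, A, B, 1 / 4)
                              - binomial_form(w1, w2, A, B, 4))
```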
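\[ X = A^{1/2} U A^{1/2} = A^{1/2} \bigl[w_1 I + w_2 Z^t\bigr]^{1/t} A^{1/2} = A^{1/2} \bigl[w_1 I + w_2 (A^{-1/2} B A^{-1/2})^t\bigr]^{1/t} A^{1/2} \]
\[ = A^{1/2} \bigl[I \#_{1/t} \bigl(w_1 I + w_2 (A^{-1/2} B A^{-1/2})^t\bigr)\bigr] A^{1/2} = A \#_{1/t} \bigl[w_1 A + w_2 (A \#_t B)\bigr]. \]
If $t = \frac{1}{n}$ for some $n \in \mathbb{N}$, then
\[ A^{-1/2} X A^{-1/2} = U = \bigl(w_1 I + w_2 Z^{1/n}\bigr)^n = \sum_{k=0}^{n} \binom{n}{k} w_1^k w_2^{n-k} Z^{\frac{n-k}{n}} \]
and hence
\[ X = A^{1/2} \Bigl[\sum_{k=0}^{n} \binom{n}{k} w_1^k w_2^{n-k} Z^{\frac{n-k}{n}}\Bigr] A^{1/2} = \sum_{k=0}^{n} \binom{n}{k} w_1^k w_2^{n-k} A^{1/2} Z^{\frac{n-k}{n}} A^{1/2} \]
\[ = \sum_{k=0}^{n} \binom{n}{k} w_1^k w_2^{n-k} \bigl(A \#_{\frac{n-k}{n}} B\bigr) = \sum_{k=0}^{n} \binom{n}{k} w_1^k w_2^{n-k} \bigl(B \#_{\frac{k}{n}} A\bigr). \quad\square \]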
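\[ A^{-1/2} \bigl(A \#_{1/t} \bigl[w_1 A + w_2 (A \#_t B)\bigr]\bigr) A^{-1/2} = \bigl[w_1 I + w_2 Z^t\bigr]^{1/t} \to Z^{w_2}. \tag{3.8} \]
That is, $P_t(w_1, w_2; A, B) \to A^{1/2} Z^{w_2} A^{1/2} = A^{1/2} (A^{-1/2} B A^{-1/2})^{w_2} A^{1/2} = A \#_{w_2} B$. This
further implies that
\[ \lim_{t \to 0^-} P_t(w_1, w_2; A, B) = \lim_{t \to 0^-} P_{-t}(w_1, w_2; A^{-1}, B^{-1})^{-1} = \bigl(A^{-1} \#_{w_2} B^{-1}\bigr)^{-1} = A \#_{w_2} B. \]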
Remark 3.10. We note that the power mean $P_t(A, B)$ coincides with the operator mean arising
from $f_t : (0, \infty) \to (0, \infty)$ defined by $f_t(x) = \bigl(\frac{x^t + 1}{2}\bigr)^{1/t}$. It is called the quasi-arithmetic
(power) mean of order $t$. Its operator monotonicity (cf. Proposition 3.5), infinite divisibility,
and the complete positivity of an associated linear operator have been studied by Bhatia and
Kosaki [13] and Besenyei and Petz [7]. It turns out [31] that the power mean $P_t(A, B)$ arises
as the midpoint operation of a manifold equipped with an affine connection. One can also see
that $P_t(w_1, w_2; A, B)$ $(= A \#_{1/t} [w_1 A + w_2 (A \#_t B)])$ $\ne (w_1 A^t + w_2 B^t)^{1/t}$ for non-commuting $A$
and $B$. In fact, $\lim_{t \to 0} \bigl(\frac{A_1^t + \cdots + A_n^t}{n}\bigr)^{1/t} = \exp\bigl(\frac{\log A_1 + \cdots + \log A_n}{n}\bigr)$, which is known as the Log-Euclidean
mean [3]. We note that the Log-Euclidean mean is far from the geometric mean $A \# B$ for
$n = 2$.
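The gap between the Log-Euclidean mean and the geometric mean $A \# B$ can be made concrete on a small non-commuting pair. The sketch below assumes real symmetric positive definite $2 \times 2$ matrices and NumPy; the matrices $A = \exp(\alpha\,\sigma_z)$ and $B = \exp(\alpha\,\sigma_x)$ with $\alpha = 2$ are an arbitrary illustrative choice, and `log_euclidean` is a hypothetical helper name. For a commuting pair the two means coincide.

```python
import numpy as np

def spd_fun(A, f):
    # Scalar function of a symmetric matrix via eigendecomposition.
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.T

def geo(A, B, t):
    # A #_t B = A^{1/2} (A^{-1/2} B A^{-1/2})^t A^{1/2}
    Ah = spd_fun(A, np.sqrt)
    Aih = spd_fun(A, lambda w: w ** -0.5)
    M = Aih @ B @ Aih
    return Ah @ spd_fun((M + M.T) / 2, lambda w: w ** t) @ Ah

def log_euclidean(mats):
    # exp((log A_1 + ... + log A_n) / n)
    L = sum(spd_fun(A, np.log) for A in mats) / len(mats)
    return spd_fun(L, np.exp)

alpha = 2.0
A = np.diag([np.exp(alpha), np.exp(-alpha)])
B = np.array([[np.cosh(alpha), np.sinh(alpha)],
              [np.sinh(alpha), np.cosh(alpha)]])

# Non-commuting pair: the two means differ noticeably.
gap = np.linalg.norm(log_euclidean([A, B]) - geo(A, B, 0.5))

# Commuting pair (A and A^2): the two means agree up to rounding.
C = A @ A
commuting_gap = np.linalg.norm(log_euclidean([A, C]) - geo(A, C, 0.5))
```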
\[ \sum_{i=1}^{n} w_i \log\bigl(X^{-1/2} A_i X^{-1/2}\bigr) = 0. \tag{4.9} \]
We note from (4.9) that $\Lambda(\omega; \mathbb{A}^{-1})^{-1} = \Lambda(\omega; \mathbb{A})$, the self-duality of the Karcher mean.
Lemma 4.1. Let $D \subset \mathbb{R}$ be an open interval and let $\epsilon_0 > 0$. Let $F : D \times (-\epsilon_0, \epsilon_0) \to \mathbb{R}$ be a map
satisfying
Proof. Let $a \in D$ and let $a_n$ be a sequence converging to $a$. Let $\epsilon > 0$. Since $f$ is a continuous
and increasing function, there exists $\delta > 0$ such that
Since $a_n \to a$, there exists $N_1 > 0$ such that $a - \delta < a_n < a + \delta$ for all $n \ge N_1$. Since $F$ is an
increasing function in the first variable, $F(a - \delta, 1/n) \le F(a_n, 1/n) \le F(a + \delta, 1/n)$ for all
$n \ge N_1$. That is, for all $n \ge N_1$,
\[ n\bigl[F(a - \delta, 1/n) - F(a, 0)\bigr] \le n\bigl[F(a_n, 1/n) - F(a, 0)\bigr] \le n\bigl[F(a + \delta, 1/n) - F(a, 0)\bigr]. \]
one can find $N_2 > N_1$ such that for all $n \ge N_2$, $f(a) - \epsilon < n(F(a - \delta, 1/n) - F(a, 0))$ and
$f(a) + \epsilon > n(F(a + \delta, 1/n) - F(a, 0))$. This completes the proof. □
Since the map $F(x, t) = x^t$ on $(0, \infty) \times \mathbb{R}$ satisfies the conditions in Lemma 4.1, we have the
following result.
Lemma 4.2. Let $x_0 > 0$ and let $x_n$ be a sequence of positive real numbers converging to $x_0$. Then
$\lim_{n\to\infty} n(x_n^{1/n} - 1) = \log x_0$.
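The limit in Lemma 4.2 is easy to observe numerically. The sketch below uses only the Python standard library; the value $x_0 = 3$ and the sequence $x_n = x_0 + 1/n$ are arbitrary illustrative choices, and `scaled_root_gap` is a hypothetical helper name.

```python
import math

x0 = 3.0

def scaled_root_gap(n):
    # n * (x_n^{1/n} - 1) for the illustrative sequence x_n = x0 + 1/n
    x_n = x0 + 1.0 / n
    return n * (x_n ** (1.0 / n) - 1.0)

ns = (10, 1000, 100000)
values = [scaled_root_gap(n) for n in ns]
errors = [abs(v - math.log(x0)) for v in values]
```

The errors should shrink roughly like $1/n$, in line with the first-order expansion $x^{1/n} = 1 + \frac{\log x}{n} + O(n^{-2})$.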
\[ I = \sum_{i=1}^{n} w_i X_t^{-1/2} (X_t \#_t A_i) X_t^{-1/2} = \sum_{i=1}^{n} w_i \bigl(X_t^{-1/2} A_i X_t^{-1/2}\bigr)^t. \]
\[ 0 = \sum_{i=1}^{n} w_i \frac{\bigl(X_t^{-1/2} A_i X_t^{-1/2}\bigr)^t - I}{t}. \tag{4.11} \]
Let $\{t_k\}_{k=1}^{\infty}$ be a sequence in $(0, 1]$ converging to 0. Since $X_t$ lies in the order interval
determined by the ω-weighted harmonic mean and arithmetic mean, which is compact, the
sequence {Xtk } has at least one limit point. Suppose that X0 is a limit point of {Xtk }. We
will show that X0 = Λ(ω; A). Passing to a subsequence, we may assume that Xtk → X0 , as
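\[ \lim_{k\to\infty} \frac{\bigl(X_{t_k}^{-1/2} A_i X_{t_k}^{-1/2}\bigr)^{t_k} - I}{t_k} = \lim_{k\to\infty} \frac{(Y_{t_k})^{t_k} - I}{t_k} = \lim_{k\to\infty} \frac{U_{t_k}^* (D_{t_k})^{t_k} U_{t_k} - I_m}{t_k} \]
\[ = \lim_{k\to\infty} U_{t_k}^* \Bigl[\frac{D_{t_k}^{t_k} - I}{t_k}\Bigr] U_{t_k} = \log\bigl(U_0^* D_0 U_0\bigr) = \log Y_0 = \log\bigl(X_0^{-1/2} A_i X_0^{-1/2}\bigr). \]
This together with (4.11) yields $0 = \sum_{i=1}^{n} w_i \log\bigl(X_0^{-1/2} A_i X_0^{-1/2}\bigr)$. That is, $X_0 = \Lambda(\omega; \mathbb{A})$. □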
Corollary 4.4. With P0 (ω; A) = Λ(ω; A), the map t → Pt (ω; A) is continuous on [−1, 1].
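The convergence of power means to the Karcher mean can be observed by computing $X_k = P_{1/k}(\omega; \mathbb{A})$ and measuring how far each $X_k$ is from solving the Karcher equation (4.9). The sketch below assumes real symmetric positive definite matrices and NumPy; the fixed-point solver, its tolerance, the test matrices and the helper names are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def spd_fun(A, f):
    # Scalar function of a symmetric matrix via eigendecomposition.
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.T

def geo(A, B, t):
    # A #_t B = A^{1/2} (A^{-1/2} B A^{-1/2})^t A^{1/2}
    Ah = spd_fun(A, np.sqrt)
    Aih = spd_fun(A, lambda w: w ** -0.5)
    M = Aih @ B @ Aih
    return Ah @ spd_fun((M + M.T) / 2, lambda w: w ** t) @ Ah

def power_mean(weights, mats, t, tol=1e-12, max_iter=10000):
    # Fixed-point iteration X <- sum_i w_i (X #_t A_i), for t in (0, 1].
    X = sum(w * A for w, A in zip(weights, mats))
    for _ in range(max_iter):
        X_new = sum(w * geo(X, A, t) for w, A in zip(weights, mats))
        X_new = (X_new + X_new.T) / 2
        if np.linalg.norm(X_new - X) <= tol * np.linalg.norm(X):
            return X_new
        X = X_new
    return X

def karcher_residual(weights, mats, X):
    # Norm of the left-hand side of (4.9): sum_i w_i log(X^{-1/2} A_i X^{-1/2}).
    Xih = spd_fun(X, lambda w: w ** -0.5)
    R = sum(w * spd_fun(Xih @ A @ Xih, np.log) for w, A in zip(weights, mats))
    return float(np.linalg.norm(R))

weights = [0.2, 0.3, 0.5]
mats = [np.array([[2.0, 1.0], [1.0, 3.0]]),
        np.array([[5.0, -1.0], [-1.0, 1.0]]),
        np.diag([1.0, 4.0])]
ks = [1, 2, 4, 8, 16, 32, 64]
residuals = [karcher_residual(weights, mats, power_mean(weights, mats, 1.0 / k))
             for k in ks]
```

Expanding (4.11) to first order in $t$ suggests that the residual decays roughly linearly in $t = 1/k$, so this limiting procedure is best read as a conceptual approximation scheme rather than a fast solver.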
The basic properties of power means in Proposition 3.5 together with Theorem 4.3 provide
simple proofs of some important properties of the Karcher mean.
Corollary 4.5. (Cf. [24,12].) The Karcher mean satisfies the following properties:
Proof. By Proposition 3.5, Theorem 4.3 and by the Karcher equation, (P1)–(P13) are immediate.
For instance, since each Pt (ω; ·) is monotonic, its limit Λ(ω; ·) also is.
(P14) By Proposition 3.5(13), Φ(Pt (ω; A)) Pt (ω; Φ(A)) for all t ∈ (0, 1]. As t → 0, we
have Φ(Λ(ω; A)) Λ(ω; Φ(A)). If Φ is strictly positive, then Φ(Pt (ω; A)) Pt (ω; Φ(A)) for
all t ∈ [−1, 1) by Proposition 3.5(13). Then Φ(Λ(ω; A)) Λ(ω; Φ(A)).
(P15) By Proposition 3.5(15), $|||P_t(\omega; \mathbb{A})||| \le \bigl(\sum_{i=1}^{n} w_i |||A_i|||^t\bigr)^{1/t}$ for all $t \in (0, 1]$. As $t \to 0$,
we have $|||\Lambda(\omega; \mathbb{A})||| \le \lim_{t\to 0} \bigl(\sum_{i=1}^{n} w_i |||A_i|||^t\bigr)^{1/t} = \prod_{i=1}^{n} |||A_i|||^{w_i}$, where the equality follows
from the fact that weighted power means of positive real numbers converge to the weighted
geometric mean. The other inequality follows similarly. □
Ando, Li and Mathias [2] listed the ten properties (P1)–(P10) for the unweighted case
ω = (1/n, . . . , 1/n) as properties that a geometric mean of n positive definite matrices should
satisfy, and their mean, called the ALM geometric mean, possesses all of them. The BMP
geometric mean of Bini, Meini and Poloni [15] is also a matrix geometric mean in this ax-
iomatic sense. In fact, there are infinitely many matrix geometric means: fixed point means of the
ALM and BMP geometric means [26] and their weighted geometric means ALM(A1 , . . . , An ) #t
BMP(A1 , . . . , An ), t ∈ [0, 1]. The properties (P11)–(P13) are special for the Karcher mean. Some
parts of the properties (P14) and (P15) have been established by Bhatia and Karandikar [12]. For
the weighted case, there are also infinitely many weighted geometric means of n positive def-
inite matrices: the weighted Karcher mean, the weighted BMP geometric mean [25] and their
weighted geometric means. We note that there has been no successful weighted extension of the
ALM geometric mean.
Next we investigate some other properties of the power mean that hold for the Karcher
mean.
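Corollary 4.6. If $\mathbb{A} \le \mathbb{B}$ and $A_i < B_i$ for some $i$, then $\Lambda(\omega; \mathbb{A}) < \Lambda(\omega; \mathbb{B})$ and $P_t(\omega; \mathbb{A}) <
P_t(\omega; \mathbb{B})$ for any sufficiently small $t$.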
< Λ(ω; B1 , . . . , Bi , . . . , Bn )
where we used the joint homogeneity and monotonicity of the Karcher mean. Finding 0 < β < 1
such that Λ(ω; A) < βΛ(ω; B) < Λ(ω; B), we have from Theorem 4.3 that
which implies that Pt (ω; A) < Pt (ω; B) for any sufficiently small t. 2
The continuity, indeed Lipschitz continuity, of the Karcher mean (P5) follows from
$d_\infty(\Lambda(\omega; \mathbb{A}), \Lambda(\omega; \mathbb{B})) \le \max_{1 \le i \le n}\{d_\infty(A_i, B_i)\}$, which in turn follows from Proposition 3.5(5) and Theo-
\[ \delta\bigl(\Lambda(\omega; \mathbb{A}), \Lambda(\omega; \mathbb{B})\bigr) \le \sum_{i=1}^{n} w_i \delta(A_i, B_i). \tag{4.12} \]
This nice inequality has been proved by Lawson and Lim [24] and Bhatia and Karandikar [12].
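Corollary 4.7. If $\delta(\Lambda(\omega; \mathbb{A}), \Lambda(\omega; \mathbb{B})) \ne \sum_{i=1}^{n} w_i \delta(A_i, B_i)$, then for any sufficiently small $t$,
$\delta(P_t(\omega; \mathbb{A}), P_t(\omega; \mathbb{B})) < \sum_{i=1}^{n} w_i \delta(A_i, B_i)$.
\[ \lim_{t \to 0} \delta\bigl(P_t(\omega; \mathbb{A}), P_t(\omega; \mathbb{B})\bigr) = \delta\bigl(\Lambda(\omega; \mathbb{A}), \Lambda(\omega; \mathbb{B})\bigr) < \sum_{i=1}^{n} w_i \delta(A_i, B_i). \quad\square \]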
Property (P12) of the Karcher mean implies that Λ(ω̂; A1 , . . . , An−1 ) is the unique fixed point
of the map f (X) = Λ(ω; A1 , . . . , An−1 , X). By (4.12), f is a strict contraction on P with respect
to the Riemannian metric. The following (Löwner) order behavior around the fixed point of $f$ is
special for the Karcher mean.
One can obtain in a way similar to the preceding that Λ(ω; A1 , . . . , An−1 , An ) An implies
Λ(ω̂; A1 , . . . , An−1 ) Λ(ω; A1 , . . . , An−1 , An ) and
\[ (Y) \qquad \sum_{i=1}^{n} w_i \log A_i \le 0 \quad\text{implies}\quad \Lambda(\omega; A_1, \dots, A_n) \le I. \]
The property (Y ), which was established by Yamazaki [33], is one of characteristic properties of
the Karcher mean by the following result.
Theorem 4.9. The Karcher mean is uniquely determined by congruence invariancy (P6), self-
duality (P8), and (Y ).
Our method of deriving the monotonicity of the Karcher mean is free from any probabilistic
and Riemannian geometric techniques because we have just started from the Karcher equa-
tion (4.9). The Karcher equation can be defined on the convex cone of positive definite operators
on an infinite dimensional Hilbert space. But the existence and uniqueness of a positive definite
solution have not previously been investigated in any depth. The weighted power means exist
since the Thompson metric exists on the cone of positive definite operators, on which Lemma 2.2
and Lemma 2.3 are still valid [21,22]. So if one can show the monotonicity of the power mean
function $t \mapsto P_t(\omega; \mathbb{A})$, then the strong limit of the sequence $X_k = P_{1/k}(\omega; \mathbb{A})$ exists and is
probably a solution of the Karcher equation. Note that the power mean $P_t(\omega; \mathbb{A})$ is contained in the
order interval determined by the weighted harmonic and arithmetic means.
By a numerical simulation, the following result seems to be true. If $0 < t \le s \le 1$, then
$P_t(\omega; \mathbb{A}) \le P_s(\omega; \mathbb{A})$ for all $\omega \in \Delta_n$ and $\mathbb{A} \in \mathbb{P}^n$. By (3.8), it is true for $n = 2$.
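A numerical experiment of the kind alluded to here could look as follows. This is only a sketch under stated assumptions (real symmetric positive definite matrices, NumPy, a plain fixed-point solver, one arbitrary test tuple, hypothetical helper names); it is not the authors' simulation and, being a finite check, provides evidence rather than a proof of the monotonicity of $t \mapsto P_t(\omega; \mathbb{A})$.

```python
import numpy as np

def spd_fun(A, f):
    # Scalar function of a symmetric matrix via eigendecomposition.
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.T

def geo(A, B, t):
    # A #_t B = A^{1/2} (A^{-1/2} B A^{-1/2})^t A^{1/2}
    Ah = spd_fun(A, np.sqrt)
    Aih = spd_fun(A, lambda w: w ** -0.5)
    M = Aih @ B @ Aih
    return Ah @ spd_fun((M + M.T) / 2, lambda w: w ** t) @ Ah

def power_mean(weights, mats, t, tol=1e-12, max_iter=10000):
    # Fixed-point iteration X <- sum_i w_i (X #_t A_i), for t in (0, 1].
    X = sum(w * A for w, A in zip(weights, mats))
    for _ in range(max_iter):
        X_new = sum(w * geo(X, A, t) for w, A in zip(weights, mats))
        X_new = (X_new + X_new.T) / 2
        if np.linalg.norm(X_new - X) <= tol * np.linalg.norm(X):
            return X_new
        X = X_new
    return X

weights = [0.2, 0.3, 0.5]
mats = [np.array([[2.0, 1.0], [1.0, 3.0]]),
        np.array([[5.0, -1.0], [-1.0, 1.0]]),
        np.diag([1.0, 4.0])]
ts = [0.1, 0.25, 0.5, 0.75, 1.0]
means = [power_mean(weights, mats, t) for t in ts]

# Smallest eigenvalue of P_s - P_t for consecutive orders t < s; a value
# that is non-negative up to rounding indicates P_t <= P_s.
gaps = [float(np.linalg.eigvalsh(means[j + 1] - means[j]).min())
        for j in range(len(ts) - 1)]
```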
Changing the weighted arithmetic mean operation in the defining Eq. (3.4) of the power mean
Pt (ω; A) into any weighted geometric mean G(ω; A) of n positive definite matrices, which is
non-expansive for the Riemannian metric or the Thompson metric, yields other matrix geometric
means via the geometric mean equation
For instance, one may take G = BMP and G = ALM for the unweighted case (see [25] for their
non-expansiveness). One can check by the non-expansive property that a unique positive definite
solution exists, denoted by Gt (ω; A). By the self-duality of G, G−t (ω; A) := Gt (ω; A−1 )−1 =
Gt (ω; A). Then by using the fixed point approach in Proposition 3.5, one can see that Gt is a
weighted matrix mean (satisfies (P1)–(P10)) and is also non-expansive. By (P13), Λt = Λ for all
t ∈ (0, 1]. The general convergence of Gt (ω; A) as t → 0 and the monotonicity of t → Gt (ω; A)
are non-trivial and suggest interesting future work.
Acknowledgments
The authors thank an anonymous referee for his/her insightful comments and suggestions.
This work was supported by the Korea Science and Engineering Foundation (KOSEF) grant
funded by the Korea government (MEST) (No. 2009-0070972).
References
[1] T. Ando, F. Hiai, Operator log-convex functions and operator means, Math. Ann. 350 (2011) 611–630.
[2] T. Ando, C.K. Li, R. Mathias, Geometric means, Linear Algebra Appl. 385 (2004) 305–334.
[3] V. Arsigny, P. Fillard, X. Pennec, N. Ayache, Geometric means in a novel vector space structure on symmetric
positive-definite matrices, SIAM J. Matrix Anal. Appl. 29 (2006) 328–347.
[4] A. Barachant, S. Bonnet, M. Congedo, C. Jutten, Riemannian geometry applied to BCI classification, preprint.
[5] F. Barbaresco, Interactions between symmetric cone and information geometries: Bruhat–Tits and Siegel spaces
models for higher resolution autoregressive doppler imagery, in: Lecture Notes in Comput. Sci., vol. 5416, 2009,
pp. 124–163.
[6] M. Berger, A Panoramic View of Riemannian Geometry, Springer-Verlag, 2003.
[7] A. Besenyei, D. Petz, Completely positive mappings and mean matrices, Linear Algebra Appl. 435 (2011) 984–997.
[8] R. Bhatia, Matrix Analysis, Springer-Verlag, New York, 1997.
[9] R. Bhatia, On the exponential metric increasing property, Linear Algebra Appl. 375 (2003) 211–220.
[10] R. Bhatia, Positive Definite Matrices, Princeton Ser. Appl. Math., Princeton University Press, Princeton, NJ, 2007.
[11] R. Bhatia, J. Holbrook, Riemannian geometry and matrix geometric means, Linear Algebra Appl. 413 (2006) 594–
618.
[12] R. Bhatia, R. Karandikar, Monotonicity of the matrix geometric mean, Math. Ann., doi:10.1007/s00208-011-
0721-9, in press.
[13] R. Bhatia, H. Kosaki, Mean matrices and infinite divisibility, Linear Algebra Appl. 424 (2007) 36–54.
[14] D. Bini, B. Iannazzo, Computing the Karcher mean of symmetric positive definite matrices, preprint.
[15] D. Bini, B. Meini, F. Poloni, An effective matrix geometric mean satisfying the Ando–Li–Mathias properties, Math.
Comp. 79 (2010) 437–452.
[16] G. Corach, H. Porta, L. Recht, Convexity of the geodesic distance on spaces of positive operators, Illinois J. Math. 38
(1994) 87–94.
[17] R. Horn, C. Johnson, Matrix Analysis, Cambridge University Press, 1985.
[18] H. Karcher, Riemannian center of mass and mollifier smoothing, Comm. Pure Appl. Math. 30 (1977) 509–541.
[19] F. Kubo, T. Ando, Means of positive linear operators, Math. Ann. 246 (1980) 205–224.
[20] S. Lang, Fundamentals of Differential Geometry, Grad. Texts in Math., Springer, 1999.
[21] J. Lawson, Y. Lim, Symmetric spaces with convex metrics, Forum Math. 19 (2007) 571–602.
[22] J. Lawson, Y. Lim, Metric convexity of symmetric cones, Osaka J. Math. 44 (2007) 795–816.
[23] J. Lawson, Y. Lim, A general framework for extending means to higher orders, Colloq. Math. 113 (2008) 191–221.
[24] J. Lawson, Y. Lim, Monotonic properties of the least squares mean, Math. Ann. 351 (2011) 267–279.
[25] H. Lee, Y. Lim, T. Yamazaki, Multi-variable weighted geometric means of positive definite matrices, Linear Algebra
Appl. 435 (2011) 307–322.
[26] Y. Lim, On Ando–Li–Mathias geometric mean equations, Linear Algebra Appl. 428 (2008) 1767–1777.
[27] J. Matharu, J. Aujla, Some inequalities for unitarily invariant norms, Linear Algebra Appl., doi:10.1016/
j.laa.2010.08.013, in press.
[28] M. Moakher, A differential geometric approach to the geometric mean of symmetric positive-definite matrices,
SIAM J. Matrix Anal. Appl. 26 (2005) 735–747.
[29] M. Moakher, On the averaging of symmetric positive-definite tensors, J. Elasticity 82 (2006) 273–296.
[30] K.-H. Neeb, Compressions of infinite-dimensional bounded symmetric domains, Semigroup Forum 61 (2001) 71–
105.
[31] M. Pálfia, Classification of affine matrix means, preprint, 2011.
[32] A.C. Thompson, On certain contraction mappings in a partially ordered vector space, Proc. Amer. Math. Soc. 14
(1963) 438–443.
[33] T. Yamazaki, The Riemannian mean and matrix inequalities related to the Ando–Hiai inequality and Chaotic order,
Oper. Matrices, in press.