Theory and Computation of Tensors: Multi-Dimensional Arrays, 1st Edition, Yimin Wei, 2024
Theory and Computation of Tensors
Multi-Dimensional Arrays
WEIYANG DING
YIMIN WEI
No part of this publication may be reproduced or transmitted in any form or by any means, electronic
or mechanical, including photocopying, recording, or any information storage and retrieval system,
without permission in writing from the publisher. Details on how to seek permission, further
information about the Publisher’s permissions policies and our arrangements with organizations such as
the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website:
www.elsevier.com/permissions.
This book and the individual contributions contained in it are protected under copyright by the
Publisher (other than as may be noted herein).
Notices
Knowledge and best practice in this field are constantly changing. As new research and experience
broaden our understanding, changes in research methods, professional practices, or medical treatment
may become necessary.
Practitioners and researchers must always rely on their own experience and knowledge in evaluating
and using any information, methods, compounds, or experiments described herein. In using such
information or methods they should be mindful of their own safety and the safety of others, including
parties for whom they have a professional responsibility.
To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume
any liability for any injury and/or damage to persons or property as a matter of products liability,
negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas
contained in the material herein.
Preface v
I General Theory 1
II Hankel Tensors 37
4 Inheritance Properties 59
4.1 Inheritance Properties 59
4.2 The First Inheritance Property of Hankel Tensors 61
4.2.1 A Convolution Formula 61
4.2.2 Lower-Order Implies Higher-Order 63
4.2.3 SOS Decomposition of Strong Hankel Tensors 65
4.3 The Second Inheritance Property of Hankel Tensors 66
4.3.1 Strong Hankel Tensors 66
4.3.2 A General Vandermonde Decomposition of Hankel Matrices 68
4.3.3 An Augmented Vandermonde Decomposition of Hankel Tensors 71
4.3.4 The Second Inheritance Property of Hankel Tensors 75
4.4 The Third Inheritance Property of Hankel Tensors 77
III M-Tensors 79
Bibliography 125
Introduction and Preliminaries
We first introduce the concepts and sources of tensors in this chapter. Several
essential and frequently used operations involving tensors are also included. Fur-
thermore, two basic topics, tensor decompositions and tensor eigenvalue prob-
lems, are briefly discussed at the end of this chapter.
Mid Final
Sub. 1 Sub. 2 Sub. 3 Sub. 1 Sub. 2 Sub. 3
Std. 1 s111 s121 s131 s112 s122 s132
Std. 2 s211 s221 s231 s212 s222 s232
Std. 3 s311 s321 s331 s312 s322 s332
Std. 4 s411 s421 s431 s412 s422 s432
Example 1.2. Another important realization of tensors is the storage of color
images and videos. A black-and-white image can be stored as a greyscale matrix,
whose entries are the greyscale values of the corresponding pixels. Color images
are often built from several stacked color channels, each of which represents
value levels of the given channel. For example, RGB images are composed of
three independent channels for red, green, and blue primary color components.
We can apply a 3rd -order tensor P to store an RGB image, whose (i, j, k) entry
denotes the value of the k-th channel in the (i, j) position. (k = 1, 2, 3 represent
the red, green, and blue channel, respectively.) In order to store a color video,
we may need an extra index for the time axis. That is, we employ a 4th -order
tensor M = (mijkt ), where M(:, :, :, t) stores the t-th frame of the video as a
color image.
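As a concrete sketch of this storage scheme (array sizes and pixel values below are illustrative stand-ins, not data from the text):

```python
import numpy as np

# A hypothetical 4x4 RGB image stored as a 3rd-order tensor P,
# where P[i, j, k] is the value of channel k (0: red, 1: green, 2: blue)
# at pixel (i, j).
rng = np.random.default_rng(0)
P = rng.random((4, 4, 3))

red_channel = P[:, :, 0]          # a 4x4 greyscale matrix
print(P.shape)                    # (4, 4, 3)

# A video of 10 such frames becomes a 4th-order tensor M,
# with M[:, :, :, t] storing the t-th frame as a color image.
M = np.stack([rng.random((4, 4, 3)) for _ in range(10)], axis=3)
print(M.shape)                    # (4, 4, 3, 10)
```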
Example 1.3. Denote x = (x1, x2, . . . , xn)> ∈ Rn. As we know, a degree-1
polynomial p1(x) = c1 x1 + c2 x2 + · · · + cn xn can be rewritten as p1(x) = x> c,
where the vector c = (c1, c2, . . . , cn)>. Similarly, a degree-2 polynomial
p2(x) = Σ_{i,j=1}^n cij xi xj, that is, a quadratic form, can be simplified into
p2(x) = x> Cx, where the matrix C = (cij). By analogy, if we denote an
m-th-order tensor C = (c_{i1 i2 ...im}) and apply a notation, which will be
introduced in the next section, then the degree-m homogeneous polynomial

    pm(x) = Σ_{i1=1}^n Σ_{i2=1}^n · · · Σ_{im=1}^n c_{i1 i2 ...im} x_{i1} x_{i2} . . . x_{im}

can be rewritten as

    pm(x) = C x^m.
Moreover, x> c = 0 is often used to denote a hyperplane in Rn. Similarly,
C x^m = 0 can stand for a degree-m hypersurface in Rn. We shall see in Section
1.2 that the normal vector at a point x0 on this hypersurface is n_{x0} = C x0^{m−1}.
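A short sketch of evaluating pm(x) = C x^m for m = 3 with einsum; the tensor C and the vector x below are made-up illustrative data:

```python
import numpy as np

# Evaluate p_3(x) = C x^3 = sum_{i,j,k} c_{ijk} x_i x_j x_k.
n = 3
x = np.array([1.0, 2.0, 3.0])
C = np.zeros((n, n, n))
C[0, 0, 0] = 1.0          # contributes c_{111} x_1^3 = 1
C[1, 2, 0] = 2.0          # contributes c_{231} x_2 x_3 x_1 = 2*2*3*1 = 12

p = np.einsum('ijk,i,j,k->', C, x, x, x)   # p_3(x) = C x^3
print(p)                                   # 1 + 12 = 13.0

# The vector C x^{m-1} from Section 1.2 (the normal direction at x):
normal = np.einsum('ijk,j,k->i', C, x, x)
print(normal)                              # [1. 6. 0.]
```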
for all i, j = 1, 2, . . . , n and denote (xt)i = Pr(Yt = i), that is, Xt = xt xt>, then
the stationary probability distribution x satisfies x = P x^2 [74]. We shall see in
Section 1.2 that x is a special eigenvector of the tensor P.
From the above examples, we can gain some basic ideas about what tensors
are and where they come from. Generally speaking, there are two kinds of
tensors: the first kind is a data structure, which admits different dimensions
according to the complexity of the data; the second kind is an operator, which
takes on different meanings in different situations.
• The vectorization operator vec(·) turns tensors into column vectors.
Take a 2 × 2 × 2 tensor A = (aijk), i, j, k = 1, 2, for example; then
vec(A) = (a111, a211, a121, a221, a112, a212, a122, a222)>.
There are many different ways to reshape tensors into matrices, which
are often referred to as “unfoldings.” The most frequently applied one
is called the modal unfolding. The mode-k unfolding A(k) of an m-th-order
tensor A of size n1 × n2 × · · · × nm is an nk-by-(N/nk) matrix, where
N = n1 n2 . . . nm. Again, using the above 2 × 2 × 2 example, its mode-1,
mode-2, and mode-3 unfoldings are

    A(1) = [ a111 a121 a112 a122 ]
           [ a211 a221 a212 a222 ],

    A(2) = [ a111 a211 a112 a212 ]
           [ a121 a221 a122 a222 ],

    A(3) = [ a111 a211 a121 a221 ]
           [ a112 a212 a122 a222 ],

respectively. Sometimes the mode-k unfolding is also denoted as Unfoldk(·).
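A minimal NumPy sketch of vec(·) and the modal unfolding, assuming the column-major (first index fastest) ordering that the examples above imply; note that Python indices are 0-based while the text's are 1-based:

```python
import numpy as np

def vec(A):
    # Column-major vectorization: the first index varies fastest.
    return A.reshape(-1, order='F')

def unfold(A, k):
    """Mode-k unfolding: an n_k-by-(N/n_k) matrix (k is 0-based here)."""
    return np.moveaxis(A, k, 0).reshape(A.shape[k], -1, order='F')

# The 2x2x2 example: encode entry a_{ijk} as the number 100i + 10j + k
# (with 1-based i, j, k) so the layout is easy to read off.
A = np.empty((2, 2, 2))
for i in range(2):
    for j in range(2):
        for k in range(2):
            A[i, j, k] = 100*(i+1) + 10*(j+1) + (k+1)

print(vec(A))        # a111 a211 a121 a221 a112 a212 a122 a222
print(unfold(A, 0))  # matches A_(1) above
```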
• The transposition operation of a matrix is understood as the exchange
of the two indices. But higher-order tensors have more indices, so we
have many more transpositions of tensors. If A is a 3rd-order tensor,
then there are six possible transpositions, denoted as A<[σ(1),σ(2),σ(3)]>,
where (σ(1), σ(2), σ(3)) is any of the six permutations of (1, 2, 3). When
B = A<[σ(1),σ(2),σ(3)]>, it means that b_{iσ(1) iσ(2) iσ(3)} = a_{i1 i2 i3} for all
i1, i2, i3.
If all the entries of a tensor are invariant under any permutation of the
indices, then we call it a symmetric tensor. For example, a 3rd-order tensor
A is said to be symmetric if and only if aijk = aikj = ajik = ajki = akij = akji
for all i, j, k.
3. A ×k (α1 M1 + α2 M2 ) = α1 A ×k M1 + α2 A ×k M2 ,
4. Unfoldk (A ×k M ) = M > A(k) ,
5. vec(A ×1 M1 ×2 M2 · · · ×m Mm ) = (Mm ⊗ · · · ⊗ M2 ⊗ M1 )> vec(A),
where A is a tensor, Mk are matrices, and α1 , α2 are scalars.
• If the matrices degrade into column vectors, then we obtain another cluster
of important notations for tensor spectral theory. Let A be an mth -order
n-dimensional tensor, that is, of size n × n × · · · × n, and x be a vector of
length n, then for simplicity:
Ax^m = A ×1 x ×2 x ×3 x · · · ×m x is a scalar,
Ax^{m−1} = A ×2 x ×3 x · · · ×m x is a vector,
Ax^{m−2} = A ×3 x · · · ×m x is a matrix.
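These three contractions can be sketched with einsum for m = 3 (the tensor and vector are random illustrative data):

```python
import numpy as np

# For a 3rd-order n-dimensional tensor A and a vector x of length n,
# contract trailing modes to get A x^3 (scalar), A x^2 (vector), A x (matrix).
n = 4
rng = np.random.default_rng(1)
A = rng.random((n, n, n))
x = rng.random(n)

Axm  = np.einsum('ijk,i,j,k->', A, x, x, x)   # scalar  A x^3
Axm1 = np.einsum('ijk,j,k->i', A, x, x)       # vector  A x^2
Axm2 = np.einsum('ijk,k->ij', A, x)           # matrix  A x

# Consistency: contracting one more mode with x recovers the next object.
assert np.isclose(Axm, Axm1 @ x)
assert np.allclose(Axm1, Axm2 @ x)
```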
• Like the vector case, an inner product of two tensors A and B of the same
size is defined by

    ⟨A, B⟩ = Σ_{i1=1}^{n1} · · · Σ_{im=1}^{nm} a_{i1 i2 ...im} · b_{i1 i2 ...im},

which is exactly the usual inner product of the two vectors vec(A) and
vec(B).
• The outer product of two tensors is a higher-order tensor. Let A and
B be mth -order and (m0 )th -order tensors, respectively. Then their outer
product A ◦ B is an (m + m0 )th -order tensor with
(A ◦ B)i1 ...im j1 ...jm0 = ai1 i2 ...im · bj1 j2 ...jm0 .
If a and b are vectors, then a ◦ b = ab> .
• We sometimes refer to the Hadamard product of two tensors of the
same size, defined entrywise as

    (A ⊛ B)i1 i2 ...im = ai1 i2 ...im · bi1 i2 ...im .

The Hadamard product will also be denoted as A .∗ B in the descriptions
of some algorithms, which is a MATLAB-type notation.
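The inner, outer, and Hadamard products above all admit one-line NumPy sketches (shapes here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.random((2, 3, 4))
B = rng.random((2, 3, 4))

# Inner product: <A, B> = vec(A) . vec(B).
inner = np.sum(A * B)
assert np.isclose(inner, A.reshape(-1) @ B.reshape(-1))

# Hadamard product: entrywise, MATLAB's A .* B.
hadamard = A * B

# Outer product of two vectors: a o b = a b^T.
a = rng.random(3)
b = rng.random(4)
outer = np.multiply.outer(a, b)
assert np.allclose(outer, np.outer(a, b))
```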
where U ∈ Rm×r and V ∈ Rn×r are column-orthogonal matrices, that is,
U>U = Ir and V>V = Ir, and Σ = diag(σ1, σ2, . . . , σr) is a positive diagonal
matrix. Then σk are called the singular values of A. If U = [u1, u2, . . . , ur] and
V = [v1, v2, . . . , vr], then the SVD can be rewritten as

    A = Σ_{i=1}^r σi ui vi>,

where r might be larger than n. Each term u1i ◦ u2i ◦ · · · ◦ umi in the CP
decomposition is called a rank-one tensor. The least number of terms, that is,

    R = min{ r : A = Σ_{i=1}^r σi u1i ◦ u2i ◦ · · · ◦ umi },
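A sketch of assembling a tensor from such a rank-one sum, for m = 3 with made-up weights and factor matrices:

```python
import numpy as np

# Build A = sum_{i=1}^r sigma_i * u_{1i} o u_{2i} o u_{3i}
# from factor matrices U1, U2, U3 whose i-th columns are u_{1i}, u_{2i}, u_{3i}.
n, r = 4, 2
rng = np.random.default_rng(3)
sigma = rng.random(r)
U1, U2, U3 = (rng.random((n, r)) for _ in range(3))

A = np.einsum('r,ir,jr,kr->ijk', sigma, U1, U2, U3)

# Each term is the outer product of three vectors (a rank-one tensor):
term0 = sigma[0] * np.einsum('i,j,k->ijk', U1[:, 0], U2[:, 0], U3[:, 0])
term1 = sigma[1] * np.einsum('i,j,k->ijk', U1[:, 1], U2[:, 1], U3[:, 1])
assert np.allclose(A, term0 + term1)
```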
for all i1 , i2 , . . . , im , where G1 and Gm are matrices and G2 , . . . , Gm−1 are 3rd -
order tensors. If r1 , r2 , . . . , rm−1 are much smaller than the tensor size, then
the TT representation will greatly reduce the storage cost for the tensor.
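A back-of-envelope count illustrates the saving (the numbers n, m, r below are made up, and the cores are taken of uniform size r × n × r for simplicity):

```python
# Compare entries stored by a full m-th-order tensor of size n x ... x n
# against a TT representation with all ranks r_1 = ... = r_{m-1} = r.
def full_storage(n, m):
    return n ** m

def tt_storage(n, m, r):
    # two boundary matrices (n x r) plus (m - 2) core tensors (r x n x r)
    return 2 * n * r + (m - 2) * r * n * r

n, m, r = 10, 6, 3
print(full_storage(n, m))   # 1000000
print(tt_storage(n, m, r))  # 60 + 360 = 420
```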
This was the original motivation when Qi [98] and Lim [76] first introduced this
kind of eigenvalue problem.
We also have other nonhomogeneous tensor eigenvalue definitions. For ex-
ample, if a scalar λ ∈ C and a nonzero vector x ∈ Cn satisfy

    Ax^{m−1} = λx with x>x = 1,

then λ is called an E-eigenvalue of A and x is called a corresponding E-
eigenvector [99]. Furthermore, when the tensor A, the scalar λ, and the vector x
are all real, we call λ a Z-eigenvalue of A and x a corresponding Z-eigenvector.
For a real symmetric tensor A, its Z-eigenvectors are exactly the KKT points
of the polynomial optimization problem
    max/min Ax^m,
    s.t. x1^2 + x2^2 + · · · + xn^2 = 1.
We will introduce more definitions of tensor eigenvalues in Chapter 2, which
can be unified into the generalized tensor eigenvalue problem Ax^{m−1} = λBx^{m−1}.
Further discussions will be conducted within this unified framework.
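The KKT characterization above suggests a simple numerical sketch. The following shifted power-type iteration (in the spirit of the SS-HOPM method of Kolda and Mayo, not a method from this book) seeks a Z-eigenpair of a symmetric 3rd-order tensor; the shift alpha and the rank-one test tensor are illustrative choices:

```python
import numpy as np

def z_eigenpair(A, x0, alpha=1.0, iters=500):
    # Iterate x <- normalize(A x^{m-1} + alpha x); at a fixed point,
    # A x^{m-1} = lambda x with ||x||_2 = 1.
    x = x0 / np.linalg.norm(x0)
    for _ in range(iters):
        y = np.einsum('ijk,j,k->i', A, x, x) + alpha * x
        x = y / np.linalg.norm(y)
    lam = np.einsum('ijk,i,j,k->', A, x, x, x)   # lambda = A x^m
    return lam, x

# Symmetric test tensor: A = a o a o a has Z-eigenvector a/||a||
# with Z-eigenvalue ||a||^3 = 125 for a = (3, 4).
a = np.array([3.0, 4.0])
A = np.einsum('i,j,k->ijk', a, a, a)
lam, x = z_eigenpair(A, np.array([1.0, 0.0]))
assert np.allclose(np.einsum('ijk,j,k->i', A, x, x), lam * x, atol=1e-8)
```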
Chapter 2
Generalized Tensor Eigenvalue Problems

    Ax^{m−1} = λBx^{m−1}.
Cui, Dai, and Nie [30] impose the constraint λk+1 < λk − δ
when the k-th eigenvalue is obtained, so that the (k + 1)-th eigenvalue can also
be computed by the same method as the previous ones. Chen, Han, and Zhou
[25] proposed homotopy methods for generalized tensor eigenvalue problems.
It was pointed out in [19, 20, 30, 68] that the generalized eigenvalue frame-
work unifies several definitions of tensor eigenvalues, such as the eigenvalues and
    S̄ x^{m−1} = λ x̄  with ‖x‖2 = 1,
    S x̄^{m−1} = λ x,

    S̃(1 : n, . . . , 1 : n) = S̄  and  S̃(n + 1 : 2n, . . . , n + 1 : 2n) = S,

    max Lx^m,  s.t.  L_p x^m = 1,
then we call (α, β) an eigenvalue of the regular tensor pair {A, B} and x the
corresponding eigenvector. When B is nonsingular, that is, det(B) ≠ 0, no
nonzero vector x ∈ Cn satisfies Bx^{m−1} = 0, according to [57, Theorem
3.1]. Thus β ≠ 0 if (α, β) is an eigenvalue of {A, B}. We also call λ = α/β ∈ C
an eigenvalue of the tensor pair {A, B} when det(B) ≠ 0. Denote the spectrum,
that is, the set of all the eigenvalues, of {A, B} as

    λ(A, B) = { (α, β) ∈ C^{1,2} : det(βA − αB) = 0 },
Since det(B̃) ≠ 0, the eigenvalues of {Ã, B̃} are exactly the complex roots
of the polynomial equation det(Ã − λB̃) = 0. By [57, Proposition 2.4], the
    ψθ(A, B) := max_{x∈Cn\{0}} θ(‖Ax^{m−1}‖, ‖Bx^{m−1}‖).

It is apparent that ρθ(A, B) ≤ ψθ(A, B), since θ(‖Ax^{m−1}‖, ‖Bx^{m−1}‖) = θ(|α|, |β|)
if (α, β) ∈ λ(A, B) and x is the corresponding eigenvector.
When det(B) ≠ 0, we can define the spectral radius in a simpler way:

    ψ(A, B) := max_{x∈Cn\{0}} ‖Ax^{m−1}‖ / ‖Bx^{m−1}‖.
It is easy to verify that tan ρθ (A, B) = ρ(A, B) ≤ ψ(A, B) = tan ψθ (A, B).
Furthermore, if B is fixed, then the nonnegative function ψ(·, B) is a seminorm
on Cn×n×···×n . When A, B are matrices and B = I, the result becomes the
familiar one that the spectral radius of a matrix is always no larger than its
norm.
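A crude numerical sketch (not a method from the text): sampling real directions gives a lower bound on ψ(A, B), since the maximum is taken over all of Cn \ {0} and the ratio is invariant under scaling of x:

```python
import numpy as np

def psi_lower_bound(A, B, trials=500, seed=0):
    # Monte-Carlo lower bound for psi(A, B) = max_x ||A x^{m-1}|| / ||B x^{m-1}||,
    # sampling real x only (so the true complex maximum may be larger).
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    best = 0.0
    for _ in range(trials):
        x = rng.standard_normal(n)
        num = np.linalg.norm(np.einsum('ijk,j,k->i', A, x, x))
        den = np.linalg.norm(np.einsum('ijk,j,k->i', B, x, x))
        if den > 1e-12:
            best = max(best, num / den)
    return best

# Sanity check: if A = 2B, the ratio equals 2 for every direction x.
rng = np.random.default_rng(1)
B = rng.random((3, 3, 3))
A = 2.0 * B
print(psi_lower_bound(A, B))   # 2.0 (up to rounding)
```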
A tensor is called diagonal if all its entries except the diagonal ones aii...i (i = 1, 2, . . . , n) are
zeros. We call {A, B} a diagonalizable tensor pair if there are two nonsingular
matrices P and Q such that
are two diagonal tensors. Let the diagonal entries of C and D be {c1, c2, . . . , cn}
and {d1, d2, . . . , dn}, respectively. If (ci, di) ≠ (0, 0) for all i = 1, 2, . . . , n,
then {A, B} is a regular tensor pair. Furthermore, (ci, di) are exactly all the
eigenvalues of {A, B}, and their multiplicities are (m − 1)^{n−1}.
It should be pointed out that the concept of “diagonalizable tensor pair”
is not as general as the concept of “diagonalizable matrix pair” [113, Section
6.2.3]. For instance, if matrices A and B are both symmetric, then the matrix
pair {A, B} must be diagonalizable. However, this is not true for symmetric
tensor pairs. Hence we shall present a nontrivial diagonalizable tensor pair to
illustrate that it is a reasonable definition.
Example 2.1. Let A and B be two mth -order n-dimensional anti-circulant
tensors [39]. According to the result in that paper, we know that
are both diagonal, where Fn is the n-by-n Fourier matrix [49, Section 1.4].
Therefore an anti-circulant tensor pair must be diagonalizable.
    ‖A‖p := ‖A(1)‖p.

Notice that for a positive integer k, we have ‖x^{[k]}‖∞ = ‖x^{⊗k}‖∞, where x^{[k]} =
[x1^k, x2^k, . . . , xn^k]> is the componentwise power and x^{⊗k} = x ⊗ x ⊗ · · · ⊗ x is the
Kronecker product of k copies of x [49, Section 12.3]. Denote MA := A ×1 M>
and AM^{m−1} := A ×2 M · · · ×m M for simplicity. Then we can prove the following
lemma for the ∞-norm:
Lemma 2.2. Let {A, B} and {C, D} = {CI, DI} be two m-th-order n-dimensional
regular tensor pairs, where C and D are two matrices. If (α, β) ∈ C^{1,2} is an
eigenvalue of {A, B}, then either det(βC − αD) = 0 or

Proof. If (α, β) ∈ C^{1,2} satisfies det(βC − αD) ≠ 0, then it holds for an
arbitrary nonzero vector y ∈ Cn that (βC − αD)y^{m−1} ≠ 0, which is equivalent
to (βC − αD)y^{[m−1]} ≠ 0. Then the matrix βC − αD is nonsingular.
Since (α, β) is an eigenvalue of {A, B}, there exists a nonzero vector x ∈ Cn
such that βAx^{m−1} = αBx^{m−1}. This indicates that
Finally, we obtain the Gershgorin circle theorem for regular tensor pairs.

Theorem 2.4. Let {A, B} be an m-th-order n-dimensional regular tensor pair.
Suppose that (aii...i, bii...i) ≠ (0, 0) for all i = 1, 2, . . . , n. Denote the disks

    Gi(A, B) := { (α, β) ∈ C^{1,2} : chord((α, β), (aii...i, bii...i)) ≤ γi }

for i = 1, 2, . . . , n, where

    γi = √( (Σ_{(i2,...,im)≠(i,...,i)} |a_{ii2...im}|)^2 + (Σ_{(i2,...,im)≠(i,...,i)} |b_{ii2...im}|)^2 ) / √( |aii...i|^2 + |bii...i|^2 ).

Then λ(A, B) ⊆ ∪_{i=1}^n Gi(A, B).
If B is taken as the unit tensor I, then Theorem 2.4 reduces to the Gershgorin
circle theorem for single tensors, that is, [98, Theorem 6(a)].
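The radii γi are straightforward to compute. Below is a sketch for a 3rd-order pair, checked on a made-up diagonal pair, for which every radius must vanish:

```python
import numpy as np

def gershgorin_radii(A, B):
    # gamma_i = sqrt(off_a^2 + off_b^2) / sqrt(|a_ii..i|^2 + |b_ii..i|^2),
    # where off_a, off_b sum |entries| over (i2, i3) != (i, i).
    n = A.shape[0]
    gammas = []
    for i in range(n):
        off_a = np.sum(np.abs(A[i])) - abs(A[i, i, i])
        off_b = np.sum(np.abs(B[i])) - abs(B[i, i, i])
        denom = np.sqrt(abs(A[i, i, i])**2 + abs(B[i, i, i])**2)
        gammas.append(np.sqrt(off_a**2 + off_b**2) / denom)
    return gammas

# Diagonal test pair: all off-diagonal sums vanish, so every radius is zero.
n = 3
A = np.zeros((n, n, n))
B = np.zeros((n, n, n))
for i in range(n):
    A[i, i, i] = i + 1.0
    B[i, i, i] = 1.0
print(gershgorin_radii(A, B))   # [0.0, 0.0, 0.0]
```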
Furthermore, the Gershgorin circle theorem can have a tighter version when
the order of the tensor pair is no less than 3. A tensor is called semi-symmetric
[86] if its entries are invariant under any permutation of the last (m − 1)
indices. Then for an arbitrary tensor A, there is a semi-symmetric tensor Ã
such that Ax^{m−1} = Ãx^{m−1} for all x ∈ Cn. Concretely, the entries of Ã are

    ã_{i i2 ...im} = (1 / |π(i2, . . . , im)|) Σ_{(i2′,i3′,...,im′) ∈ π(i2,i3,...,im)} a_{i i2′ i3′ ...im′},
Hence we have a tighter version of the Gershgorin circle theorem for regular tensor
pairs; that is, the disk G̃i(A, B) in the following theorem must be no larger than
the disk Gi(A, B) in Theorem 2.4.
Theorem 2.5. Let {A, B} be an m-th-order n-dimensional regular tensor pair.
Assume that (aii...i, bii...i) ≠ (0, 0) for all i = 1, 2, . . . , n. Let Ã and B̃ be the
semi-symmetric tensors such that Ax^{m−1} = Ãx^{m−1} and Bx^{m−1} = B̃x^{m−1} for
all x ∈ Cn. Denote the disks

    G̃i(A, B) := { (α, β) ∈ C^{1,2} : chord((α, β), (aii...i, bii...i)) ≤ γ̃i }

for i = 1, 2, . . . , n, where

    γ̃i = √( (Σ_{(i2,...,im)≠(i,...,i)} |ã_{ii2...im}|)^2 + (Σ_{(i2,...,im)≠(i,...,i)} |b̃_{ii2...im}|)^2 ) / √( |aii...i|^2 + |bii...i|^2 ).

Then λ(A, B) ⊆ ∪_{i=1}^n G̃i(A, B).
b ⊗(m−1) †
δ βx
r 1 ⊗(m−1) ,
E(1) /δ1 , −F(1) /δ2 = b
δ2 α
bx