Learning The Model From The Data
Carlos Cabrelli and Ursula Molter
All content following this page was uploaded by Carlos Cabrelli on 30 October 2023.
1. Introduction
In this note, we provide an overview of recent developments in the field
of optimal subspaces, which has recently gained significant attention due to its
applications in signal and image models. We refer the reader to the references for
more details and proofs.
The proliferation of available data has transformed the process of extracting
meaningful information from it. As each type of data possesses specific characteris-
tics, the design of tailored algorithms can take advantage of these shared attributes,
leading to improved efficiency.
Therefore, it is crucial to construct a model for each type of data that relies
on the fewest possible parameters while capturing their common features. One
potential approach to achieving this is by assuming certain hypotheses about the
device or phenomenon that generated the data, such as assuming that the signals
under consideration are band-limited.
However, given the vast diversity of data available today, this approach may
not be suitable in many cases, for example for data such as internet traffic
or stock market values. Instead, our strategy is to generate the
model from the data itself, using a set of subspaces as models, from which we can
choose the best fit for our data. The subspaces that we select and the data are all
from the same vector space.
In signal and image processing, there are often certain transformations that are
known to leave important features of the data invariant. For example, in image
processing, translations, rotations, and scaling are common transformations that
preserve the spatial structure of an image.
To build effective models for such data, it is important to incorporate these
known invariances into the model. This can be done by explicitly including trans-
formation parameters and optimizing them along with the other parameters.
Incorporating invariances into the model can lead to more robust and accu-
rate performance on real-world data, as the model is better equipped to handle
variations and changes in the input data.
We will take into account subspaces that are invariant under both translations
and rotations. To simplify the model, we will only consider discrete sets of trans-
lations and rotations.
We want the subspaces in the class to be “small” in a sense that will be specified
in each case. This condition will be essential for the applications.
So the general scheme will be the following: let $H$ be a Hilbert space and $\mathcal{M}$ a
family of subspaces of $H$. Consider a finite set of data $F = \{f_1, \dots, f_m\}$ and define
\[ E(F, S) = \sum_{j=1}^{m} \| f_j - P_S f_j \|^2, \tag{1} \]
where $S \in \mathcal{M}$ and $P_S$ denotes the orthogonal projection onto the subspace $S$. The
functional $E$ will be our gauge of how well the data fit the
subspace. We analyze the existence and construction of an optimal subspace in the
class $\mathcal{M}$, that is, one that minimizes the functional $E(F, S)$ over $\mathcal{M}$.
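To make the functional in (1) concrete, here is a minimal sketch for the finite-dimensional real case $H = \mathbb{R}^N$. The function name `fitness`, the column-wise storage of the data, and the assumption that the subspace is given by a full-column-rank basis matrix are illustrative choices, not part of the original text.

```python
import numpy as np

def fitness(F, B):
    """Evaluate E(F, S) = sum_j ||f_j - P_S f_j||^2 for real data.

    F : (N, m) array whose columns are the data f_1, ..., f_m (assumed layout).
    B : (N, k) array whose columns span the subspace S (assumed full column rank).
    """
    # Orthonormalize the basis so that P_S = Q Q^T.
    Q, _ = np.linalg.qr(B)
    residual = F - Q @ (Q.T @ F)
    return float(np.sum(residual ** 2))

# Two data vectors in R^3 and S = the xy-plane: only the z-components remain.
F = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [2.0, -1.0]])
B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])
error = fitness(F, B)  # equals 2**2 + (-1)**2
```

In this toy case the error is the sum of the squared z-components of the data, since projecting onto the xy-plane discards exactly those components.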
Section 2 will focus on the case of a finite dimensional Hilbert space $H$ and a
class $\mathcal{M}$ consisting of all the subspaces of $H$ with dimension at most
a fixed positive integer $\ell$. Next, in Section 3, we will examine the prototypical
scenario of subspaces that are invariant under integer translations, that is,
shift-invariant spaces (SIS). In Section 4, we will consider optimality for the
subclass of SIS that exhibits additional invariance. Lastly, in Section 5, we will
present the results for models that are invariant under translations and rotations.
The next theorem is an adaptation of the Eckart–Young theorem ([12, 18]) and will be
used throughout the paper.
Given a set of vectors $F = \{f_1, \dots, f_m\}$ of a Hilbert space $H$, define the Gramian
matrix of $F$ by $[G_F]_{i,j} = \langle f_i, f_j \rangle_H$, set $X = \operatorname{span}\{f_1, \dots, f_m\}$, and let
$r = \dim X = \operatorname{rank} G_F$.
With this notation we have:
Theorem 2.1 ([1, Theorem 4.1]). Let $F = \{f_1, \dots, f_m\} \subseteq H$, where $H$ is a Hilbert
space, and let $\ell \le r$ be a positive integer. Let $\lambda_1 \ge \cdots \ge \lambda_m \in \mathbb{R}$ be the eigenvalues
of the matrix $G_F$ and $y_1, \dots, y_m \in \mathbb{C}^m$, with $y_i = (y_{i1}, \dots, y_{im})^t$, the associated left
orthonormal eigenvectors. Define the vectors $q_1, \dots, q_\ell \in H$ by
\[ q_i = \theta_i \sum_{j=1}^{m} y_{ij} f_j, \qquad i = 1, \dots, \ell, \]
where $\theta_i = \lambda_i^{-1/2}$ if $\lambda_i \neq 0$ and $\theta_i = 0$ otherwise. Then $\{q_1, \dots, q_\ell\}$ is a Parseval
frame of $W^* = \operatorname{span}\{q_1, \dots, q_\ell\}$ and the subspace $W^*$ is optimal in the sense that,
if $W$ is any subspace with $\dim(W) \le \ell$, we have
\[ E(F, W^*) = \sum_{i=1}^{m} \| f_i - P_{W^*} f_i \|^2 \le E(F, W) = \sum_{i=1}^{m} \| f_i - P_W f_i \|^2. \]
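The construction in Theorem 2.1 can be sketched numerically for real data in $H = \mathbb{R}^N$. This is only an illustration under stated assumptions: the data are real, so $G_F$ is symmetric and its left and right eigenvectors coincide; the names `optimal_subspace` and `fit_error` and the tolerance used to decide when an eigenvalue counts as zero are hypothetical choices.

```python
import numpy as np

def optimal_subspace(A, ell):
    """Build q_1, ..., q_ell from the Gramian of the columns of A (real data).

    A   : (N, m) array whose columns are f_1, ..., f_m (assumed layout).
    ell : number of generators, assumed to satisfy ell <= rank(A).
    Returns (Q, lam): Q has columns q_i, lam holds the eigenvalues in decreasing order.
    """
    G = A.T @ A                       # Gramian [G]_{i,j} = <f_i, f_j>
    lam, Y = np.linalg.eigh(G)        # eigh returns ascending eigenvalues
    order = np.argsort(lam)[::-1]     # reorder so lambda_1 >= ... >= lambda_m
    lam, Y = lam[order], Y[:, order]
    tol = 1e-12 * max(lam[0], 1.0)    # assumed threshold for "lambda_i = 0"
    theta = np.where(lam > tol, 1.0 / np.sqrt(np.maximum(lam, tol)), 0.0)
    Q = (A @ Y[:, :ell]) * theta[:ell]  # q_i = theta_i * sum_j y_ij f_j
    return Q, lam

def fit_error(A, Q):
    """E(F, W) when the columns of Q form a Parseval frame of W, so P_W = Q Q^T."""
    residual = A - Q @ (Q.T @ A)
    return float(np.sum(residual ** 2))

rng = np.random.default_rng(0)
A = rng.normal(size=(6, 4))           # four data vectors in R^6
Q, lam = optimal_subspace(A, 2)
best = fit_error(A, Q)
```

In this real, finite-dimensional setting the vectors $q_i$ coincide with the leading left singular vectors of the data matrix, and the minimal error equals the sum of the discarded eigenvalues $\lambda_{\ell+1} + \cdots + \lambda_m$, which is the Eckart–Young statement the theorem adapts.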
is the Paley–Wiener space of functions that are bandlimited to $[-1/2, 1/2]$, defined
by
\[ PW = \{ f \in L^2(\mathbb{R}) : \operatorname{supp}(\hat{f}) \subseteq [-1/2, 1/2] \}. \]
It is easy to prove that for a measurable set $\Omega \subseteq \mathbb{R}^d$, the space
\[ V_\Omega := \{ f \in L^2(\mathbb{R}^d) : \operatorname{supp}(\hat{f}) \subseteq \Omega \} \tag{2} \]
is translation invariant. Moreover, Wiener's theorem (see [15]) proves that any
closed translation invariant subspace of $L^2(\mathbb{R}^d)$ is of the form (2).
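The translation invariance of $V_\Omega$ comes from the fact that a translation only multiplies the Fourier transform by a unimodular factor, so the support of $\hat{f}$ does not move. The sketch below checks a discrete analogue of this, assuming periodic signals of length $N$, the discrete Fourier transform in place of the Fourier transform, and an arbitrarily chosen frequency set `omega`; none of these choices come from the original text.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 64

# Assumed frequency set Omega: a small symmetric band of DFT bins.
omega = np.zeros(N, dtype=bool)
omega[[0, 1, 2, 3, 61, 62, 63]] = True

# Build a signal whose DFT is supported on Omega.
spectrum = np.where(omega, rng.normal(size=N) + 1j * rng.normal(size=N), 0.0)
f = np.fft.ifft(spectrum)

# Translate by 5 samples (circular shift) and look at the new spectrum.
shifted_spectrum = np.fft.fft(np.roll(f, 5))

# Largest spectral magnitude outside Omega after the shift (should be ~0).
leak = float(np.max(np.abs(shifted_spectrum[~omega])))
same_modulus = bool(np.allclose(np.abs(shifted_spectrum), np.abs(spectrum)))
```

The shift changes only the phases of the DFT coefficients, so the shifted signal stays in the discrete counterpart of $V_\Omega$.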
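Note that if $\Phi$ is a set of generators of $V$, i.e. $V = S(\Phi)$, and $V$ has extra
invariance $M$, then
\[ S(\Phi) = \overline{\operatorname{span}}\{ T_k \varphi : \varphi \in \Phi,\ k \in \mathbb{Z}^d \} = \overline{\operatorname{span}}\{ T_\alpha \varphi : \varphi \in \Phi,\ \alpha \in M \}. \]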
In [2] the authors characterize those shift invariant spaces $V \subseteq L^2(\mathbb{R})$ that have
extra invariance. They show that either $V$ is translation invariant, or there exists
a maximum positive integer $n$ such that $V$ is $\frac{1}{n}\mathbb{Z}$-invariant.
The $d$-dimensional case is considered in [3]. There, a characterization of the
extra invariance of $V$ when $M$ is not all of $\mathbb{R}^d$ is obtained.
where $\Omega$ is a section of the quotient $\mathbb{R}^d / \mathbb{Z}^d$. We refer the reader to [3] for more
details.
For each $\sigma \in \mathcal{N}$, we consider $F^\sigma = \{f_1^\sigma, \dots, f_m^\sigma\} \subseteq L^2(\mathbb{R}^d)$, where $f_j^\sigma$ is such
that $\widehat{f_j^\sigma} = \hat{f}_j \chi_{B_\sigma}$ for $j = 1, \dots, m$.
where $U_\sigma$ are unitary and $\Lambda_\sigma(\omega) := \operatorname{diag}(\lambda_1^\sigma(\omega), \dots, \lambda_m^\sigma(\omega)) \in \mathbb{C}^{m \times m}$, and they are
also measurable matrices. We also have $\lambda_1^\sigma(\omega) \ge \cdots \ge \lambda_m^\sigma(\omega)$ for each $\sigma \in \mathcal{N}$.
Using the decompositions of the blocks $G_\sigma$ we have that
where $U$ has blocks $U_\sigma$ on the diagonal, and $\Lambda$ is diagonal with blocks $\Lambda_\sigma$. We
recall here that for almost every $\omega$ the matrix $\Lambda(\omega)$ collects all the eigenvalues
of the Gramian $G_{\widetilde{F}}(\omega)$ and the columns of the matrix $U(\omega)$ are the associated left
eigenvectors. Note that an eigenvector associated to the eigenvalue $\lambda_j^\sigma(\omega)$ has all
the components not corresponding to the block $\sigma$ equal to zero.
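Now for each fixed $\omega \in \mathcal{U}$, we consider $\{(i_1(\omega), j_1(\omega)), \dots, (i_n(\omega), j_n(\omega))\}$ with
$i_s(\omega) \in \mathcal{N}$ and $j_s(\omega) \in \{1, \dots, m\}$ and $n = m\kappa$ such that
\[ \lambda_{j_1(\omega)}^{i_1(\omega)}(\omega) \ge \cdots \ge \lambda_{j_n(\omega)}^{i_n(\omega)}(\omega) \ge 0 \]
where $\theta_{j_s}^{i_s}(\omega) = (\lambda_{j_s}^{i_s}(\omega))^{-1/2}$ if $\lambda_{j_s}^{i_s}(\omega) \neq 0$ and $\theta_{j_s}^{i_s}(\omega) = 0$ otherwise.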
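The selection step described above, carried out at one fixed $\omega$, amounts to pooling the eigenvalues of all blocks, keeping the largest ones together with their (block, index) labels, and inverting their square roots. The following sketch illustrates that bookkeeping only; the dictionary layout, the names `top_pairs` and `theta`, and the zero tolerance are assumptions made for the example.

```python
def top_pairs(block_eigs, n):
    """Return the n largest eigenvalues as (block label i, index j, value).

    block_eigs maps a block label to its eigenvalues at a fixed omega,
    listed in decreasing order; indices j are 1-based as in the text.
    """
    entries = [(float(val), i, j + 1)
               for i, eigs in block_eigs.items()
               for j, val in enumerate(eigs)]
    entries.sort(key=lambda e: -e[0])   # decreasing by eigenvalue, stable on ties
    return [(i, j, val) for val, i, j in entries[:n]]

def theta(val, tol=1e-12):
    """theta = val**(-1/2) for a nonzero eigenvalue, 0 otherwise (tol is assumed)."""
    return 0.0 if val <= tol else val ** -0.5

# Two blocks with two eigenvalues each, at one fixed omega.
blocks = {0: [4.0, 1.0], 1: [9.0, 0.0]}
selected = top_pairs(blocks, 3)
weights = [theta(val) for _, _, val in selected]
```

Here the three largest eigenvalues are 9, 4 and 1, coming from block 1 (index 1) and block 0 (indices 1 and 2), and the zero eigenvalue of block 1 would receive weight 0 if it were selected.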
Now we are ready to state the main result of this section.
(2) The functions $\{h_1, \dots, h_\ell\}$ defined in (4) are in $L^2(\mathbb{R}^d)$ and if we define
$\varphi_1, \dots, \varphi_\ell$ by $\hat{\varphi}_j = h_j$, then $\Phi = \{\varphi_1, \dots, \varphi_\ell\}$ is a generator set for the
optimal subspace $V^*$ and the set $\{\varphi_i(\cdot - k),\ k \in \mathbb{Z}^d,\ i = 1, \dots, \ell\}$ is a Parseval
frame for $V^*$.
We have the fundamental theorem of Bieberbach [7], [20] which states the fol-
lowing:
Remark 5.3.
• Note that the set $\Lambda$ is not empty by Bieberbach's theorem [7]. It consists of
translations on a lattice, also denoted by $\Lambda$, which is isomorphic to $\mathbb{Z}^d$; we will
denote these translations by $T_k$ for $k \in \Lambda$.
• The point group $G$ of $\Gamma$ is a finite subgroup of $O(d)$, the orthogonal group
of $\mathbb{R}^d$, that preserves the lattice of translations, i.e. $G\Lambda = \Lambda$. The simplest
examples arise when $G$ is a group of rotations, so we will abuse notation and
denote the action of $G$ on $L^2(\mathbb{R}^d)$ by $R_g$ for $g \in G$.
General results on crystal groups can be found for example in [14], [21], [17], [7],
and [8].
Note that the simplest example of a crystal group is the group of translations
on a lattice $\Lambda$, i.e. $\Gamma = \{T_k : k \in \Lambda\}$, where $T_k(x) = x + k$.
One very important class of crystal groups are the splitting crystal groups:
Every crystal group is naturally embedded in a splitting group, and very often
arguments for general groups can be reduced relatively easily to the splitting case
and then proved in that simpler setting. For this reason, from now on $\Gamma$ will
always be assumed to be a splitting crystal group.
5.2. The structure of $\Gamma$-invariant spaces. Let us recall the structure of closed
subspaces of $L^2(\mathbb{R}^d)$ that are invariant under the action of $\Gamma = \Lambda \rtimes G$, the semidirect
product of a uniform lattice $\Lambda$ in $\mathbb{R}^d$ and a discrete and countable group $G$ that
acts on $\mathbb{R}^d$ by continuous invertible automorphisms. We will assume that $g\Lambda = \Lambda$
for all $g \in G$, which implies that the Haar measure of $\mathbb{R}^d$ is invariant under the
action of $G$.
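A closed subspace $V \subseteq L^2(\mathbb{R}^d)$ is $\Gamma$-invariant if $T_k R_g V \subseteq V$ for all $(k, g) \in \Gamma$.
Here, for $f \in V$, $T_k f(x) = f(x - k)$, $k \in \Lambda$, and $R_g f(x) = f(g^{-1} x)$, $g \in G$.
Equivalently, $V$ is $\Gamma$-invariant if
\[ f \in V \ \Rightarrow \ T_k f \in V \ \ \forall\, k \in \Lambda \quad \text{and} \quad R_g f \in V \ \ \forall\, g \in G. \]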
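As a discrete illustration of the operators $T_k$ and $R_g$ defined above, the sketch below works on a periodic $N \times N$ grid, with $\Lambda$ the integer grid and $G$ generated by the rotation by 90 degrees. The grid model and the function names `translate` and `rotate` are assumptions made for the example. It checks the relation $R_g T_k = T_{gk} R_g$, which follows directly from the two definitions and from $g\Lambda = \Lambda$.

```python
import numpy as np

def translate(f, k):
    """(T_k f)(x) = f(x - k) on the periodic grid Z_N x Z_N."""
    return np.roll(f, shift=(int(k[0]), int(k[1])), axis=(0, 1))

def rotate(f, g):
    """(R_g f)(x) = f(g^{-1} x) for an integer orthogonal 2x2 matrix g."""
    N = f.shape[0]                      # f is assumed to be square, N x N
    g_inv = g.T                         # inverse of an orthogonal matrix
    x, y = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
    src_x = (g_inv[0, 0] * x + g_inv[0, 1] * y) % N
    src_y = (g_inv[1, 0] * x + g_inv[1, 1] * y) % N
    return f[src_x, src_y]

rng = np.random.default_rng(0)
f = rng.normal(size=(8, 8))
g = np.array([[0, -1],
              [1, 0]])                  # rotation by 90 degrees
k = np.array([2, 5])

lhs = rotate(translate(f, k), g)        # R_g T_k f
rhs = translate(rotate(f, g), g @ k)    # T_{gk} R_g f
```

Because rotating a translate gives a translate (by $gk$) of the rotation, a closed subspace that is invariant under every $T_k$ and every $R_g$ is automatically invariant under all compositions $T_k R_g$, which is the equivalence stated above.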
Lemma 5.7 ([5, Lemma 5.1]). Let $F^g$ be the family $\{R(g) f_i : (i, g) \in I_m \times G\} \subseteq L^2(\mathbb{R}^d)$,
ordered with the lexicographical ordering of $I_m \times G := \{1, 2, \dots, m\} \times G$,
and let $G_{F^g}$ be its Gramian as before.
1. For $\omega \in \Omega$, let $\{\sigma_{i,g}(\omega)^2 : (i, g) \in I_m \times G\}$ be the eigenvalues of $G(\omega)$
ordered decreasingly with the lexicographical ordering of $I_m \times G$, counted
with their multiplicity. Then they are $G$-invariant, in the sense that
Theorem 5.8 ([5, Theorem 5.2]). Let $F = \{f_1, \dots, f_m\}$ be a set of functional data
in $L^2(\mathbb{R}^d)$. Using the same notations as in Lemma 5.7, the following holds:
1. For all $\kappa \in \{1, \dots, m\}$ there exists a $\Gamma$-invariant space $W \subseteq L^2(\mathbb{R}^d)$ generated
by $\Gamma$-orbits of a family $\{\psi_i\}_{i=1}^{\kappa} \subseteq L^2(\mathbb{R}^d)$ such that
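where
\[ C_i^{j,g'}(\omega) = \sum_{g \in G} \theta_{i,g}(\omega)\, V_{j,g'}^{i,g}(\omega)\, \chi_{g^* \Omega_0}(\omega), \qquad i = 1, \dots, \kappa, \]
and $\theta_{i,g}(\omega) = (\sigma_{i,g}(\omega))^{-1}$ if $\sigma_{i,g}(\omega) \neq 0$ and $0$ otherwise. All identities
hold for a.e. $\omega \in \Omega$.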
References
[1] A. Aldroubi, C. Cabrelli, D. Hardin, and U. Molter, Optimal shift invariant spaces and
their Parseval frame generators, Appl. Comput. Harmon. Anal. 23 no. 2 (2007), 273–283.
[2] A. Aldroubi, C. Cabrelli, C. Heil, K. Kornelson, and U. Molter, Invariance of a
shift-invariant space, J. Fourier Anal. Appl. 16 no. 1 (2010), 60–75.
[3] M. Anastasio, C. Cabrelli, and V. Paternostro, Invariance of a shift-invariant space in
several variables, Complex Anal. Oper. Theory 5 no. 4 (2011), 1031–1050.
[4] D. Barbieri, C. Cabrelli, E. Hernández, and U. Molter, Optimal translational-rotational
invariant dictionaries for images, in Proc. SPIE 11138, Wavelets and Sparsity XVIII, 2019.
[5] D. Barbieri, C. Cabrelli, E. Hernández, and U. Molter, Approximation by group
invariant subspaces, J. Math. Pures Appl. (9) 142 (2020), 76–100.
[6] D. Barbieri, C. Cabrelli, E. Hernández, and U. Molter, Data approximation with
time-frequency invariant systems, in Landscapes of Time-Frequency Analysis—ATFA 2019, Appl.
Numer. Harmon. Anal., Birkhäuser/Springer, Cham, 2020, pp. 29–42.
[7] L. Bieberbach, Über die Bewegungsgruppen der Euklidischen Räume. (Erste Abh.), Math.
Ann. 70 no. 3 (1911), 297–336.
[8] L. Bieberbach, Über die Bewegungsgruppen der Euklidischen Räume. (Zweite Abh.), Math.
Ann. 72 no. 3 (1912), 400–412.
[9] C. de Boor, R. A. DeVore, and A. Ron, The structure of finitely generated shift-invariant
spaces in $L^2(\mathbb{R}^d)$, J. Funct. Anal. 119 no. 1 (1994), 37–78.
[10] C. Cabrelli and C. A. Mosquera, Subspaces with extra invariance nearest to observed
data, Appl. Comput. Harmon. Anal. 41 no. 2 (2016), 660–676.
[11] C. Cabrelli, C. A. Mosquera, and V. Paternostro, An approximation problem in
multiplicatively invariant spaces, in Functional Analysis, Harmonic Analysis, and Image
Processing: A Collection of Papers in Honor of Björn Jawerth, Contemp. Math. 693, Amer. Math.
Soc., Providence, RI, 2017, pp. 143–165.
[12] C. Eckart and G. Young, The approximation of one matrix by another of lower rank,
Psychometrika 1 (1936), 211–218.
[13] D. R. Farkas, Crystallographic groups and their mathematics, Rocky Mountain J. Math.
11 no. 4 (1981), 511–551.
[14] B. Grünbaum and G. C. Shephard, Tilings and Patterns, W. H. Freeman, New York, 1987.
[15] H. Helson, Lectures on Invariant Subspaces, Academic Press, New York, 1964.
[16] J. S. Lomont, Applications of Finite Groups, Dover, New York, 1993.
[17] G. E. Martin, Transformation Geometry: An Introduction to Symmetry, Undergraduate
Texts in Mathematics, Springer-Verlag, New York-Berlin, 1982.
[18] E. Schmidt, Zur Theorie der linearen und nicht linearen Integralgleichungen. Zweite
Abhandlung: Auflösung der allgemeinen linearen Integralgleichung, Math. Ann. 64 no. 2 (1907),
161–174.
[19] R. Tessera and H. Wang, Uncertainty principles in finitely generated shift-invariant spaces
with additional invariance, J. Math. Anal. Appl. 410 no. 1 (2014), 134–143.
[20] J. A. Wolf, Spaces of Constant Curvature, McGraw-Hill, New York, 1967.
[21] H. Zassenhaus, Beweis eines Satzes über diskrete Gruppen, Abh. Math. Sem. Univ. Hamburg
12 no. 1 (1938), 289–312.