75TH ANNIVERSARY OF SIGNAL PROCESSING SOCIETY SPECIAL ISSUE

Geert Leus, Antonio G. Marques, José M.F. Moura, Antonio Ortega, and David I Shuman

Graph Signal Processing

History, development, impact, and outlook

Digital Object Identifier 10.1109/MSP.2023.3262906
Date of current version: 1 June 2023
1053-5888/23©2023IEEE, IEEE SIGNAL PROCESSING MAGAZINE, June 2023

Signal processing (SP) excels at analyzing, processing, and inferring information defined over regular (first continuous, later discrete) domains such as time or space. Indeed, the last 75 years have shown how SP has made an impact in areas such as communications, acoustics, sensing, image processing, and control, to name a few. With the digitalization of the modern world and the increasing pervasiveness of data-collection mechanisms, information of interest in current applications oftentimes arises in non-Euclidean, irregular domains. Graph SP (GSP) generalizes SP tasks to signals living on non-Euclidean domains whose structure can be captured by a weighted graph. Graphs are versatile, able to model irregular interactions, easy to interpret, and endowed with a corpus of mathematical results, rendering them natural candidates to serve as the basis for a theory of processing signals in more irregular domains.

The term graph signal processing was coined a decade ago in the seminal works of [1], [2], [3], and [4]. Since these papers were published, GSP-related problems have drawn significant attention, not only within the SP community [5] but also in machine learning (ML) venues, where research in graph-based learning has increased significantly [6]. Graph signals are well-suited to model measurements/information/data associated with (indexed by) a set where 1) the elements of the set belong to the same class (regions of the cerebral cortex, members of a social network, weather stations across a continent); 2) there exists a relation (physical or functional) of proximity, influence, or association among the different elements of that set; and 3) the strength of such a relation among the pairs of elements is not homogeneous. In some scenarios, the supporting graph is a physical, technological, social, information, or biological network where the links can be explicitly observed. In many other cases, the graph is implicit, capturing some notion of dependence or similarity across nodes, and the links must be inferred from the data themselves. As a result, GSP is a broad framework that encompasses and extends classical SP methods, tools, and algorithms to application domains of the modern technological world, including social, transportation, communication,
and brain networks; recommender systems; financial engineering; distributed control; and learning. Although the theory and application domains of GSP continue to expand, GSP has become a technology with wide use. It is a research domain pursued by a broad community, the subject of not only many journal and conference articles, but also of textbooks [5], special issues of different journals, symposia, workshops, and special sessions at ICASSP and other SP conferences.

In this article, we provide an overview of the evolution of GSP, from its origins to the challenges ahead. The first half is devoted to reviewing the history of GSP and explaining how it gave rise to an encompassing framework that shares multiple similarities with SP, and especially digital SP (DSP). A key message is that GSP has been critical to developing novel and technically sound tools, theory, and algorithms that, by leveraging analogies with and the insights of DSP, provide new ways to analyze, process, and learn from graph signals. In the second half, we shift focus to review the impact of GSP on other disciplines. First, we look at the use of GSP in data science problems, including graph learning and graph-based deep learning. Second, we discuss the impact of GSP on applications, including neuroscience and image and video processing. We finally conclude with a brief discussion of the emerging and future directions of GSP.

The early roots

The roots of GSP can be traced to algebraic and spectral graph theory, harmonic analysis, numerical linear algebra, and specific applications of these ideas to areas such as data representations for high-dimensional data, pattern recognition, (fast) transforms, image processing, computer graphics, statistical physics, partial differential equations, semisupervised learning (SSL), and neuroscience. Algebraic graph theory [7] dates back to the 1700s, and spectral graph theory [8] dates back to the mid-1900s. They study mathematical properties of graphs and link the graph structure to the spectrum (eigenvalues and eigenvectors) of matrices related to the graph. However, they generally did not consider potential signals that could be living on the graph.

In the late 1990s and early 2000s, graph-based methods for analyzing and processing data became more popular, independently, in a number of disciplines, including computer graphics [9], image processing [10], graphical models in Bayesian statistics [11], [12], dimensionality reduction [13], SSL [14], and neuroscience (e.g., the detailed history included in [15]). For example, in computer graphics, Taubin utilized graph Laplacian eigenvectors to perform surface smoothing by applying a low-pass graph filter to functions defined on polyhedral surfaces [9], and later used similar ideas to compress polygonal meshes. In image processing, weighted graphs can be defined with edges being a function of pixel distance and intensity differences. Such semilocal and nonlocal graphs were exploited for denoising (bilateral filtering), image smoothing, and image segmentation (e.g., in [10] and [16]). Graphical models [12], in particular undirected graphical models (also referred to as Markov random fields), model data as a family of random variables (the vertices), with the graph edges capturing their probabilistic dependencies. Through the graph, these models sparsely encode complex probability distributions in high-dimensional spaces. Graphical models have been widely used in Bayesian statistics and Bayesian probabilistic approaches, kernel regression methods, statistical learning, and statistical mechanics [17]. We return to SSL and neuroscience and their connections with GSP in the “SSL” and “Applications to Neuroscience” sections, respectively.

Also in the late 1990s, two new models were introduced for random networks (graphs) to model the structure of complex engineered systems, going well beyond the classical Erdös–Rényi random graphs: real-world large networked systems exhibit small-world characteristics (the Watts–Strogatz model) and scale-free degree distributions (the Barabási–Albert model). This led to a flurry of activity, usually referred to as network science, concerned with analyzing and designing complex systems like telecommunication, power grid, and large-scale infrastructure networks [18]. Although the central focus of network science was on properties of the network and its nodes (e.g., centralities, shortest paths, and clustering coefficients), network science researchers also leveraged graphs to explore the dynamics of processes such as percolation, traffic flows, synchronization, and epidemic spread [18, Part 5], often adopting mean field approximations. For example, in the investigation of the susceptible-infected-susceptible epidemiological model in scale-free graphs in [19], each vertex can be seen as having a 0/1 (susceptible/infected) signal residing on it. Advancements in network science have certainly informed the subsequent development of GSP.

In parallel, a stream of new methods for analyzing data on graphs was investigated. These methods tried specifically to combine 1) intuition and dictionary constructions for performing computational harmonic analysis on data on Euclidean domains with 2) generalizable ways to incorporate the structure of the underlying graph into the data transforms. For example, one of the first general wavelet constructions for signals on graphs was the spatial wavelet transform of [20], which was defined directly in the vertex domain. In the seminal work of Crovella and Kolaczyck [21], diffusion wavelets were constructed by 1) creating a multiresolution of approximation spaces, each spanned by graph signals generated by diffusing a unit of energy outwards from each vertex for a fixed amount of time, and 2) computing orthogonal diffusion wavelets to serve as basis functions for the detail spaces that are the sequential orthogonal complements of the approximation spaces. Spectral graph wavelets [1] traded off the orthogonality of diffusion wavelets for a simpler generative method for each wavelet atom: define a pattern in the graph spectral domain and localize that pattern to be centered at each vertex of the graph. Meanwhile, the algebraic SP approach [22], [23] showed that classical SP can be captured by a triplet defined by a shift operator. Different shifts lead to different SP models and different Fourier transforms. In particular, it showed that a shift based on Chebyshev polynomials, appropriate for lattice models like in images, leads to standard block transforms such as the discrete cosine transform (DCT) and Karhunen–Loève transform (KLT), which can be

understood as Fourier transforms on certain graphs. Numerous other types of multiresolution transforms and dictionaries for data residing on graphs, trees, and compact manifolds were investigated in the subsequent few years. These included lifting and pyramid transforms, graph filter banks, tight spectral frames, vertex-frequency transforms that generalized the classical short-time Fourier transform, and learned dictionaries (see [24] and [25] for a more complete literature review and list of references). GSP arose from these different fields, coalescing multiple perspectives into a common framework and set of ideas. In the last decade, this unifying framework has evolved into a full-fledged theory and technology.

The theoretical underpinnings

Ten years ago, [1], [2], [3], and [4] introduced the field of GSP and established many of its foundations. Remarkably, these works approached the problem from two different perspectives. Inspired by graph theory and harmonic analysis, the authors of [1] and [2] use the graph Laplacian as the core of their theory, naturally generalizing concepts such as frequencies and filter banks to the graph domain. Differently, the authors of [3] and [4] follow an algebraic approach, under which the multiplication of a graph signal by the adjacency matrix of the supporting graph yields the most basic operation of shift for a graph signal. Based on this simple operation, more advanced tools such as filtering, graph Fourier transforms (GFTs), graph frequency, or total variation can be generalized to the vertex and spectral graph domains. Rather than being considered competing approaches, these works brought complementary views and tools and, jointly, contributed to increasing the attention on the field. After introducing some common notations, this section reviews these two approaches and then explains how they were merged into an integrated framework that facilitated drawing links with classical SP and propelled the growth of GSP.

Basic definitions and notational conventions

The goal in GSP is to leverage SP and graph theory tools to analyze and process signals defined over a network domain, with notable examples including technological, social, gene, brain, knowledge, financial, marketing, and blog networks. In these setups, graphs are used to both index the data and represent relations/similarities/dependencies among the locations of the data. We denote the underlying weighted graph by $\mathcal{G} = (\mathcal{V}, \mathcal{E}, \omega)$, where $\mathcal{V} := \{1, \ldots, N\}$ denotes the set of $N$ graph vertices; $\mathcal{E} \subset \mathcal{V} \times \mathcal{V}$ denotes the set of graph edges; and $\omega : \mathcal{E} \to \mathbb{R}$ is a weight function that assigns a real-valued weight to each edge, with a higher edge weight representing a stronger similarity or dependency between the two vertices connected by that edge. A graph with edge weights all equal to one is called unweighted. A graph signal contains information associated with each vertex of the graph. For simplicity, we focus our discussion on scalar, real-valued graph signals (each signal is a mapping from $\mathcal{V}$ to $\mathbb{R}$), but the values associated with each node could be discrete, complex, or even vectors (e.g., when multiple features per node are observed). Each scalar, real-valued graph signal can equivalently be represented as an $N$-dimensional vector $\mathbf{x} := [x_1, \ldots, x_N]^\top$, with $x_i$ (also written sometimes as $[\mathbf{x}]_i$) representing the value of the signal at vertex $i$. An example of a graph signal is shown in Figure 1.

To gain some insight, consider the problem of studying Twitter patterns. Assume that we have $N$ Twitter users: each vertex $i \in \mathcal{V}$ represents a user $i$, and each edge $e = (i, j) \in \mathcal{E}$ captures that two users $i$ and $j$ follow each other. The data, $x_i$, indexed by node $i$ could, e.g., be the number of tweets that user $i$ tweeted in a given time interval. In a second application, to understand traffic flow in cities, we can examine the number of pickups of for-hire vehicles (e.g., taxis, Uber or Lyft cars, and so on) over a given time period. The graph $\mathcal{G}$ can be the city road map, with the vertices $i \in \mathcal{V}$ representing intersections, and the edges $e \in \mathcal{E}$ representing road segments between intersections. The data $x_i$ at each vertex $i$ might, e.g., be the number of pickups close to that intersection over the time period of interest. The graphs $\mathcal{G}$ in such real-world applications can be modeled as undirected (if $(i, j) \in \mathcal{E}$, then $(j, i) \in \mathcal{E}$), or directed (e.g., to capture one-way streets).

Classical SP signals such as audio and image signals that reside on Euclidean domains can also be viewed as graph signals. Consider, for instance, finite-length discrete-time 1D signals, e.g., the $N$ vertices of the graph are the time instances $i = 0, \ldots, N - 1$, with $N$ being the window length. As the signal value $x_{i+1}$ at time $i + 1$ is usually closely related to the signal value $x_i$ at the preceding time, there is a directed edge from vertex $i$ to vertex $i + 1$. At $i = N - 1$, there are different options for the boundary conditions; here, we consider the periodic boundary condition, which means that the time instant “next” to the terminal instant $N - 1$ is $i = 0$. The resulting “time graph” is then a directed cycle $\mathcal{G}_{\mathrm{dc}}$ (see Figure 2). By similar reasoning, vertices in the image graph represent the pixels, and because the image brightness or color $x_{i,j}$ at pixel $(i, j)$ is usually highly related to the brightness or colors of its four neighboring pixels, there are undirected edges from $(i, j)$ to its neighboring pixels. The corresponding graph is then an undirected 2D lattice.

At the core of GSP are $N \times N$ matrices that encode the graph’s topology. The most prominent are 1) the weighted adjacency matrix $\mathbf{A}$, whose $(i, j)$-entry is the edge weight $\omega((i, j))$ if $(i, j) \in \mathcal{E}$ and zero otherwise; 2) the combinatorial (or nonnormalized) graph Laplacian $\mathbf{L} := \mathbf{D} - \mathbf{A}$, where $\mathbf{D} = \mathrm{diag}(\mathbf{A}\mathbf{1})$ is the diagonal matrix of vertex degrees (sums of the weights of the edges adjacent to each vertex) and $\mathbf{1}$ is an $N \times 1$ vector of all ones; and 3) the normalized graph Laplacian $\mathbf{L}_{\mathrm{norm}} := \mathbf{D}^{-1/2} \mathbf{L} \mathbf{D}^{-1/2}$. We elaborate on the role of these matrices in the next section.

The spectral approach for GSP

Classical Fourier analysis of a 1D signal decomposes the signal into a linear combination of complex exponential functions (continuous or discrete) at different frequencies, with increasing frequencies corresponding to higher rates of oscillation and basis functions that are less smooth. The spectral approach to GSP [1], [2] generalizes this classical Fourier analysis by writing graph signals as linear combinations of a basis of graph signals with the property that the basis vectors can be (roughly) ordered

according to how fast they oscillate across the graph, or, related, how smooth they are with respect to the underlying graph structure. By “smooth” in this context, we mean that the values of the graph signal at each pair of neighboring vertices are similar.

The operator that captures this notion of smoothness with respect to the underlying (undirected) graph is the graph Laplacian $\mathbf{L}$. It is a discrete difference operator as we have

$$[\mathbf{L}\mathbf{x}]_i = \sum_{j=1}^{N} A_{i,j}(x_i - x_j) = \sum_{j \in \mathcal{N}_i} A_{i,j}(x_i - x_j)$$

where $\mathcal{N}_i$ is the neighborhood of node $i$ and $A_{i,j}$ is the $(i, j)$-entry of the adjacency matrix $\mathbf{A}$. Because $\mathbf{L}$ is a real symmetric matrix, it has a set of orthonormal eigenvectors $\{\mathbf{v}_\ell\}_{\ell=0}^{N-1}$ and a set of real nonnegative eigenvalues $\{\lambda_\ell\}_{\ell=0}^{N-1}$. Assuming a connected graph, it can further be shown that there is only one zero eigenvalue, $\lambda_0 = 0$, with corresponding eigenvector $\mathbf{v}_0 = \mathbf{1}/\sqrt{N}$. In matrix form, we obtain $\mathbf{L} = \mathbf{V}\,\mathrm{diag}(\boldsymbol{\lambda})\,\mathbf{V}^\top$, with $\mathbf{V} = [\mathbf{v}_0, \ldots, \mathbf{v}_{N-1}]$ and $\boldsymbol{\lambda} = [\lambda_0, \ldots, \lambda_{N-1}]^\top$.

Importantly, the graph Laplacian can also be viewed as a graph extension of the time-domain Laplacian operator $\partial^2/\partial t^2$. Just as the 1D complex exponentials (the eigenfunctions of the time-domain Laplacian operator) capture a notion of frequency, we can interpret the graph Laplacian eigenvectors as graph frequency vectors, with the associated graph Laplacian eigenvalues capturing a notion of the rate of oscillation [2].

The Laplacian operator introduces a measure of smoothness for a graph signal $\mathbf{x}$, through the graph Laplacian quadratic form

$$\mathbf{x}^\top \mathbf{L} \mathbf{x} = \sum_{(i,j) \in \mathcal{E}} A_{i,j}(x_i - x_j)^2 \qquad (1)$$

which penalizes large differences between signal values at strongly connected vertices. Because $\mathbf{v}_\ell^\top \mathbf{L} \mathbf{v}_\ell = \lambda_\ell$, it is then clear from (1) that the larger the graph frequency $\lambda_\ell$, the less smooth (or more variable) the graph Laplacian eigenvector $\mathbf{v}_\ell$. So, with the indexing convention $0 = \lambda_0 < \lambda_1 \le \cdots \le \lambda_{N-1}$, the graph frequency vectors $\{\mathbf{v}_\ell\}_{\ell=0}^{N-1}$ are ordered according to increasing variability (see Figure 1). Using the Laplacian eigenvectors as the basis, we can now define a GFT as $\mathbf{V}^\top$. It transforms a graph signal $\mathbf{x}$ into its frequency components as $\hat{\mathbf{x}} = \mathbf{V}^\top \mathbf{x}$.

Graph filters can then be interpreted as operators that modify the different frequency components of a signal $\mathbf{x}$ individually. That is, the graph filter operation can be represented in the graph Fourier domain by $H : \mathbb{R} \to \mathbb{R}$ so that $[\hat{\mathbf{y}}]_\ell = H(\lambda_\ell)[\hat{\mathbf{x}}]_\ell$. In most cases, the spectral function $H$ (oftentimes referred to as a kernel) is set to a prespecified analytical form (typically parametric) that promotes certain properties in the output signals [e.g., rectangular kernels promote smoothness and remove noise (see Figure 1)]. However, nonparametric approaches can also be used. Equally important, Shuman et al. [2] also illustrate how graph filters can be used to interpolate missing

[Figure 1 image not reproduced; the panel (d) eigenvectors are labeled λ0 = 0, λ1 = 0.11, and λ2 = 0.3.]

FIGURE 1. (a) An example of a graph with a color-coded graph signal on top. (b) The signal in the graph frequency domain and in red the frequency response of a potential low-pass graph filter. (c) The filtered graph signal. (d) The first three eigenvectors of the graph Laplacian ordered with decreasing smoothness (increasing eigenvalue).
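
The spectral pipeline illustrated in Figure 1 (Laplacian, quadratic form, GFT, low-pass kernel) can be sketched in a few lines of Python. This is only an illustrative sketch, not code from [1] or [2]. The unweighted path graph, the eight-sample signal, the cutoff, and all function names are assumptions made for the example. The path graph is chosen because its Laplacian eigenvectors have a known closed form (the DCT-II basis), so no numerical eigensolver is required.

```python
import math

def path_graph(n):
    """Adjacency A and combinatorial Laplacian L = D - A of an unweighted path."""
    A = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        A[i][i + 1] = A[i + 1][i] = 1.0
    L = [[(sum(A[i]) if i == j else 0.0) - A[i][j] for j in range(n)]
         for i in range(n)]
    return A, L

def matvec(M, x):
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def path_eigenpairs(n):
    """Closed-form eigenpairs of the path Laplacian, sorted by increasing eigenvalue."""
    lams, vecs = [], []
    for k in range(n):
        v = [math.cos(math.pi * k * (i + 0.5) / n) for i in range(n)]
        norm = math.sqrt(dot(v, v))
        vecs.append([c / norm for c in v])          # orthonormal eigenvector v_k
        lams.append(2.0 - 2.0 * math.cos(math.pi * k / n))
    return lams, vecs

def gft(vecs, x):
    """Graph Fourier transform x_hat = V^T x."""
    return [dot(v, x) for v in vecs]

def igft(vecs, x_hat):
    """Inverse transform x = V x_hat."""
    n = len(vecs[0])
    return [sum(c * v[i] for c, v in zip(x_hat, vecs)) for i in range(n)]

def spectral_filter(lams, vecs, x, H):
    """Scale each frequency component: y_hat[l] = H(lambda_l) * x_hat[l]."""
    y_hat = [H(lam) * c for lam, c in zip(lams, gft(vecs, x))]
    return igft(vecs, y_hat)

N = 8
A, L = path_graph(N)
lams, vecs = path_eigenpairs(N)
x = [1.0, 1.2, 0.9, 1.1, 3.0, 1.0, 0.8, 1.1]       # hypothetical graph signal

# Quadratic form (1), once via the matrix and once edge by edge
# (each undirected edge is counted once).
quad_matrix = dot(x, matvec(L, x))
quad_edges = sum(A[i][j] * (x[i] - x[j]) ** 2
                 for i in range(N) for j in range(i + 1, N))

# Rectangular (ideal low-pass) kernel keeping the four lowest graph frequencies.
cutoff = lams[3]
y = spectral_filter(lams, vecs, x, lambda lam: 1.0 if lam <= cutoff else 0.0)
```

Here `y` retains only the four lowest graph frequencies of `x`, so its quadratic form cannot exceed that of `x`. Swapping the rectangular kernel for any other function of the eigenvalue changes the filter without touching the rest of the sketch.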

values, and to design signal dictionaries whose atoms concentrate their energy around a few frequencies or vertices, highlighting their relevance in a number of applications.

The algebraic approach for GSP

In classical SP, convolution is a key building block present in many algorithms, including filtering, sampling, and interpolation. In defining convolution and filtering, the time shift, that is, the unit delay that transforms a signal into a delayed version of itself, plays a critical role. The output of a linear time-invariant filter is a weighted linear combination of delayed versions of the input. Similarly, the discrete Fourier transform (DFT) can be understood as the transformation that diagonalizes every linear time-invariant filter and provides an alternative description for signals and filters.

In extending these ideas to GSP, the two key contributions of [3] and [4] are 1) highlighting the relevance of defining a “graph-aware” operator that plays the role of the “most basic operation” to be performed on a signal $\mathbf{x}$ defined over a graph $\mathcal{G}$; and 2) setting this operation as $\mathbf{A}\mathbf{x}$, i.e., the multiplication of the graph signal $\mathbf{x}$ by the adjacency matrix $\mathbf{A}$ of $\mathcal{G}$. The motivation for the latter choice is twofold. First, $\mathbf{A}$ is a simple (parsimonious and linear) operator that combines the values of $\mathbf{x}$ in a manner that accounts for the local connectivity of $\mathcal{G}$. Second, when particularized to time-varying signals defined over the directed cycle $\mathcal{G}_{\mathrm{dc}}$, using $\mathbf{A}_{\mathrm{dc}}\mathbf{x}$ is equivalent to the classical unit delay, i.e., $[\mathbf{A}_{\mathrm{dc}}\mathbf{x}]_{i+1} = [\mathbf{x}]_i$.

How can this basic, graph-aware operator be leveraged to design 1) linear graph filters that are applied to a graph signal to generate another graph signal and 2) linear transforms that provide an alternative representation for a graph signal? In classical SP, the basic, nontrivial operation applied to a signal is the unit delay (time shift); in other words, the simplest filter is the time-shift filter $z^{-1}$. Because graphs are finite, we consider DSP with finite signals, and, for simplicity, with periodic signal extensions. Generic linear filters are then polynomials of this basic operator of the form $p(z) = p_0 + p_1 z^{-1} + \cdots + p_{N-1} z^{-(N-1)}$, with $z^{-l}$ being the consecutive application of the operator $z^{-1}$ to a time signal $l$ times. DSP polynomial filters are shift invariant in the sense that $z^{-1} \cdot p(z) = p(z) \cdot z^{-1}$.

Hence, to address the first question, [3] sets the simplest signal operation in GSP as multiplication by the adjacency matrix $\mathbf{A}$ and, subsequently, defines graph filters as (matrix) polynomials of the form $p(\mathbf{A}) = p_0 \mathbf{I}_N + p_1 \mathbf{A} + \cdots + p_{N-1} \mathbf{A}^{N-1}$. It is easy to see that polynomial filters are $\mathbf{A}$ invariant, in the sense that $\mathbf{A} \cdot p(\mathbf{A}) = p(\mathbf{A}) \cdot \mathbf{A}$. Apart from the theoretical motivation, the polynomial definition exhibits a number of advantages. When applied to a graph signal $\mathbf{x}$, the operation $\mathbf{A}\mathbf{x}$ can be understood as a local linear combination of the signal values at one-hop neighbors. Similarly, $\mathbf{A}^2\mathbf{x}$ is a local linear combination of $\mathbf{A}\mathbf{x}$, reaching values that are in the two-hop neighborhood. From this point of view, a graph filter $p(\mathbf{A})$ represented by a polynomial of order $L$ is mixing values that are at most $L$ hops away, with the polynomial coefficients $\{p_l\}_{l=0}^{L}$ representing the strength given to each of the neighborhoods. Another advantage is that if $\mathbf{A}$ is set to $\mathbf{A}_{\mathrm{dc}}$ (the graph representing the support of classical time signals), the graph polynomial definition $p(\mathbf{A}_{\mathrm{dc}})$ reduces to the classical time-shift definition $p(z^{-1})$ so that graph filters become linear time-invariant filters.

To address the second question, [3] defines the GFT as the linear transform that diagonalizes these graph filters of the form $p(\mathbf{A})$. Letting $\mathbf{A} = \mathbf{V}\,\mathrm{diag}(\boldsymbol{\lambda})\,\mathbf{V}^{-1}$ be the eigendecomposition of the (possibly directed) adjacency matrix $\mathbf{A}$, then $p(\mathbf{A}) = p(\mathbf{V}\,\mathrm{diag}(\boldsymbol{\lambda})\,\mathbf{V}^{-1}) = \mathbf{V}\,p(\mathrm{diag}(\boldsymbol{\lambda}))\,\mathbf{V}^{-1}$ (note that we use $\mathbf{V}^{-1}$ now instead of $\mathbf{V}^\top$ because the eigenvectors are not necessarily orthonormal as for the Laplacian). In other words, matrix polynomials can be understood as operators that transform the input by 1) multiplying it by the matrix $\mathbf{V}^{-1}$, 2) applying a diagonal operator $p(\mathrm{diag}(\boldsymbol{\lambda}))$, and 3) transforming the result back to the vertex domain with a multiplication by $\mathbf{V}$. The GFT of a graph signal and the signal spectral representation is then set as the multiplication by $\mathbf{V}^{-1}$, and the frequency response of a filter is found by calculating $p(\mathrm{diag}(\boldsymbol{\lambda}))$ (similar to the spectral approach description in the previous

[Figure 2 image not reproduced; panels are titled “Time-to-Graph Domain” and “Spectrum,” with the spectrum plotted on Re and Im axes.]

FIGURE 2. (a) From the directed cycle representing the time domain to a general graph. (b) Eigenvalues (spectrum) of the related adjacency matrices.
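
The time-domain special case shown in Figure 2(a) can be checked numerically. The sketch below is an illustration under stated assumptions, not code from [3] or [4]: the signal, the three filter taps, and every function name are hypothetical. It applies the directed-cycle shift $\mathbf{A}_{\mathrm{dc}}$, evaluates a polynomial filter $p(\mathbf{A}_{\mathrm{dc}})$ by repeated shifts, and compares the result with circular convolution and with the diagonal form $p(\mathrm{diag}(\boldsymbol{\lambda}))$ in the frequency domain, where the GFT of the directed cycle is the unitary DFT.

```python
import cmath

def shift(x):
    """Apply A_dc: [A_dc x]_{i+1} = x_i, with periodic wrap-around."""
    return [x[-1]] + x[:-1]

def poly_filter(taps, x):
    """y = sum_l taps[l] * A_dc^l x, computed by repeated shifts."""
    y = [0.0] * len(x)
    shifted = list(x)
    for p in taps:
        y = [yi + p * si for yi, si in zip(y, shifted)]
        shifted = shift(shifted)
    return y

def circular_convolution(taps, x):
    """Classical periodic convolution of the taps with x."""
    n = len(x)
    return [sum(p * x[(i - l) % n] for l, p in enumerate(taps))
            for i in range(n)]

def cycle_gft(x):
    """V^{-1} x for the directed cycle, i.e., the unitary DFT of x."""
    n = len(x)
    return [sum(x[i] * cmath.exp(-2j * cmath.pi * k * i / n) for i in range(n))
            / n ** 0.5 for k in range(n)]

def frequency_response(taps, n):
    """p(lambda_k) at the eigenvalues lambda_k = exp(-2j*pi*k/n) of A_dc."""
    return [sum(p * cmath.exp(-2j * cmath.pi * k / n) ** l
                for l, p in enumerate(taps)) for k in range(n)]

x = [1.0, 2.0, 0.0, -1.0, 3.0, 0.5]   # hypothetical periodic time signal
taps = [0.5, 0.3, 0.2]                # hypothetical coefficients p_0, p_1, p_2

y = poly_filter(taps, x)              # vertex-domain filtering
y_conv = circular_convolution(taps, x)

x_hat, y_hat = cycle_gft(x), cycle_gft(y)
response = frequency_response(taps, len(x))
# In the frequency domain the filter is diagonal: y_hat[k] = p(lambda_k) * x_hat[k].
```

With three taps, each output sample mixes the input at the same vertex and at the vertices one and two hops upstream on the cycle, which is the at-most-$L$-hops locality noted in the text.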

section). From the GFT of the signal, common SP concepts can now be defined in GSP [4], including ordering graph frequencies from low to high, or designing low- and high-pass graph filters. Figure 2 shows the generalization of the time domain to a more general graph domain. The applications in [3] to data prediction, graph signal compression, data classification, and customer behavior prediction for service providers, and in [4] to filter design and malfunction detection in sensor networks show the breadth of application domains.

The benefits of a joint framework

Although having different origins, the approaches in [1] and [2], and in [3] and [4] bring complementary perspectives. The work in [1] and [2] relies on the graph Laplacian to capture the structure of $\mathcal{G}$, uses its eigendecomposition to characterize graph signals and define filtering operations, and draws clear links with existing graph-based techniques in a number of applications. In [3] and [4], the focus is on the shift operation in the vertex domain, postulating the use of the adjacency matrix as the building block to design GSP algorithms, and unveiling a number of similarities with classical SP. Although some early works mixed the features of [1] and [2], and of [3] and [4] (e.g., the use of polynomials based on the Laplacian matrix), the publication of these four papers and related works led to the emergence of works that combine both approaches under a common framework. One way to do so is to define a generic “graph-shift operator” (GSO) that plays a dual role: 1) it can be viewed as the most basic operation applied to a graph signal, and 2) it codifies the structure of the graph in a more generic way than $\mathbf{L}$ or $\mathbf{A}$ so that it can be used to tackle a broader range of setups. Under this framework, the linear GSO $\mathbf{S} \in \mathbb{R}^{N \times N}$ has been set to different adjacency matrices (e.g., one and two hops), different graph Laplacians (e.g., combinatorial, normalized, and random walk), the precision matrix of a Gaussian–Markov random field, or even combinations of those. Based on the eigendecomposition of this operator, given by $\mathbf{S} = \mathbf{V}\,\mathrm{diag}(\boldsymbol{\lambda})\,\mathbf{V}^{-1}$, linear graph filtering can be equivalently understood as an operator that is linear and orthogonal (diagonal) in the frequency domain defined by $\mathbf{V}^{-1}$, or as the multiplication by a matrix that is a linear combination of successive applications (powers) of the GSO $\mathbf{S}$:

$$\mathbf{H}(\mathbf{S}) = \mathbf{V}\,\mathrm{diag}(\hat{\mathbf{h}})\,\mathbf{V}^{-1} \quad \text{or} \quad \mathbf{H}(\mathbf{S}) = \sum_{l=0}^{N-1} h_l \mathbf{S}^l \qquad (2)$$

where the $\mathbf{H}(\mathbf{S})$ notation is used to emphasize the dependence on the GSO $\mathbf{S}$. The first definition in (2) focuses on the frequency domain, with the filter parameters being the $N$-dimensional frequency response $\hat{\mathbf{h}} = [\hat{h}_0, \ldots, \hat{h}_{N-1}]^\top$. The second definition in (2) focuses on the vertex domain, with the parameters of the filter being the $N$ filter taps $\mathbf{h} = [h_0, \ldots, h_{N-1}]^\top$. Although we focus on degree $N - 1$ polynomials, thanks to the Cayley–Hamilton theorem, the definition in (2) can represent a matrix polynomial of any degree [3]. With these models at hand, the literature promptly addressed tasks such as prediction, classification, compression, filter identification, and filter design in graph/network contexts [3], [26], [27]. The particular solution obtained for any of these tasks depends on the GSO at hand as well as the assumptions on the graph filter. For example, if the goal is to estimate the graph-based linear mapping from a set of input–output pairs collected in matrices $\mathbf{X} = [\mathbf{x}_1, \ldots, \mathbf{x}_M]$ and $\mathbf{Y} = [\mathbf{y}_1, \ldots, \mathbf{y}_M]$, one requires $M = N$ input–output pairs if no structure is assumed for $\mathbf{H}$, and a single $M = 1$ pair if one assumes that $\mathbf{H}$ is a graph filter. Furthermore, defining the counterparts of classical finite-impulse response (FIR) and infinite-impulse response (IIR) filters as $\mathbf{H}_{\mathrm{FIR}}(\mathbf{S}) = \sum_{l=0}^{L-1} b_l \mathbf{S}^l$ and $\mathbf{H}_{\mathrm{IIR}}(\mathbf{S}) = \left(\sum_{l=0}^{L-1} a_l \mathbf{S}^l\right)^{-1}$, respectively, identifying such filters from input–output observations is feasible, even if only a subset (with cardinality larger than $2L$) of the signal values is observed [27]. Additionally, using the definitions in (2), it is not difficult to show that any cascade/parallel/feedback connection of graph filters can also be written as a graph filter, opening the door to make and exploit connections between graph-network processes and classical tools in control.

A natural next step is to use (2) to model certain properties of classes of graph signals of interest. To be more specific, consider that we model a graph signal $\mathbf{x} \in \mathbb{R}^N$ from a class of interest as $\mathbf{x} = \mathbf{H}(\mathbf{S})\mathbf{z}$, with $\mathbf{z}$ being a hidden seed signal and $\mathbf{H}(\mathbf{S})$ a generative graph filter that “transfers” some of the properties of $\mathbf{S}$ to $\mathbf{x}$. Although mathematically simple, modeling graph signals as $\mathbf{x} = \mathbf{H}(\mathbf{S})\mathbf{z}$ has proven to be fruitful. A typical approach is to assume some parsimonious structure on either $\mathbf{z}$, the filter $\mathbf{H}(\mathbf{S})$, or both, and then analyze the impact of those assumptions on the properties of $\mathbf{x}$. Standard assumptions have included $\mathbf{H}(\mathbf{S})$ being a band-limited graph filter so that $\mathbf{x}$ is graph-band limited [28], $\mathbf{H}(\mathbf{S})$ being low pass so that $\mathbf{x}$ is smooth [2], [29], [30], $\mathbf{z}$ being a white signal so that $\mathbf{x}$ is graph stationary [31], [32], or $\mathbf{z}$ being sparse so that $\mathbf{x}$ is a diffused graph signal [33], as well as combinations of those. More importantly, the combination of the generative model $\mathbf{x} = \mathbf{H}(\mathbf{S})\mathbf{z}$ and one or more of the previous structural assumptions has been leveraged to successfully generalize a number of estimation and learning tasks to the graph domain. Early examples investigated in the literature included signal denoising, sampling and interpolation, input identification, blind deconvolution, dictionary design, SSL, classification, and the generalization of stationarity to graph domains (see, e.g., [24] for a detailed review). Although covering all of these tasks goes beyond the scope of this article, we next discuss three illustrative milestones: 1) sampling and interpolation, 2) source identification and blind deconvolution, and 3) statistical descriptions of random graph signals.

We start with a simple sampling and interpolation setup that, due to its practical relevance, received early attention from multiple research groups [34]. Consider the sampling set $\mathcal{M} \subseteq \mathcal{V}$ with cardinality $M \le N$, and define the selection matrix $\boldsymbol{\Phi}_{\mathcal{M}} \in \{0, 1\}^{M \times N}$ as the $M$ rows of the $N \times N$ identity matrix indexed by the set $\mathcal{M}$. The sampled signal $\mathbf{x}_{\mathcal{M}} := \boldsymbol{\Phi}_{\mathcal{M}} \mathbf{x}$ collects the values of the graph signal $\mathbf{x}$ at the vertex set $\mathcal{M}$. The goal is to use $\mathbf{x}_{\mathcal{M}}$, along with $\mathbf{S}$, to recover $\mathbf{x}$, leveraging the structure of the graph. As the problem is ill-posed, we need to assume and enforce some structure on $\mathbf{x}$. Two widely adopted approaches

are to 1) assume that $x$ is $K$-band-limited, i.e., it is in the span of the first $K$ eigenvectors of $S$, for some $K < N$, or 2) assume that the signal $x$ is smooth with respect to the underlying graph, which can be generically modeled as the norm of $x - H(S)x$ being small, where $H(S)$ is a low-pass filter tuned to promote a particular notion of smoothness. We denote the subspace of $K$-band-limited signals by $\mathcal{X}(V_K) := \{V_K \beta \text{ for all } \beta \in \mathbb{R}^K\}$. The statement that $x \in \mathcal{X}(V_K)$ is equivalent to saying that $x$ is generated via a graph filter with $\hat{h} = [1_K^\top, 0_{N-K}^\top]^\top$. These two alternative assumptions lead to the following optimization problems for interpolation, respectively:

$$x^\star = \operatorname*{argmin}_{x \in \mathcal{X}(V_K)} \|x_{\mathcal{M}} - U_{\mathcal{M}} x\|_2^2 \quad \text{or} \quad x^\star = \operatorname*{argmin}_{x} \|x_{\mathcal{M}} - U_{\mathcal{M}} x\|_2^2 + \alpha \|(I - H(S))x\|_2^2 \tag{3}$$

with the weight $\alpha$ controlling the trade-off between the smoothness of $x^\star$ and how closely $x^\star$ matches $x$ at the nodes in $\mathcal{M}$. For band-limited signals, if $M \ge K$ and $U_{\mathcal{M}} V_K$ has full column rank, the signal $x$ can be identified from its samples $x_{\mathcal{M}}$ via $x = V_K (U_{\mathcal{M}} V_K)^\dagger x_{\mathcal{M}}$ [28]. Although this is also true for time signals, other popular results in classical SP, such as ideal low-pass filters being the optimal interpolators or regularly spaced sampling being optimal, do not hold true for the graph domain due to the lack of regularity in $\mathcal{G}$. Regarding the second optimization problem in (3), setting the gradient of the objective to zero gives the solution $x^\star = (U_{\mathcal{M}}^\top U_{\mathcal{M}} + \alpha (I - H(S))^\top (I - H(S)))^{-1} U_{\mathcal{M}}^\top x_{\mathcal{M}}$, which reduces to $(U_{\mathcal{M}}^\top U_{\mathcal{M}} + \alpha (I - H(S) H^\top(S)))^{-1} U_{\mathcal{M}}^\top x_{\mathcal{M}}$ when $H(S)$ is an ideal low-pass (projection) filter. In this case, we can interpret $U_{\mathcal{M}}^\top x_{\mathcal{M}}$ as a zero-padded graph signal that is smoothly diffused through the graph by $(U_{\mathcal{M}}^\top U_{\mathcal{M}} + \alpha (I - H(S))^\top (I - H(S)))^{-1}$.

Using the model $x = H(S)z$, source identification and blind deconvolution have also been generalized to the graph setting [33]. In both, the signal $z$ is assumed to be sparse. For source identification, given a sampled version of $x$, the goal is to identify the locations and nonzero values of $z$, which can be viewed as source nodes whose inputs are diffused throughout the network represented by $S$. For blind deconvolution, the goal is to use $x$ to identify both the sparse input $z$ and the generating filter $H(S)$, with a classical assumption being that the coefficients $h$ are sparse, or that the filter has a parsimonious FIR/IIR structure. Inspired by those works, generalizations were also investigated for demixing setups where the aggregation of multiple signals is observed (e.g., the sum of several network processes, each with different sources and dynamics).

Our last example to illustrate the benefits of a common GSP framework is the statistical description of random graph signals. Characterizing random processes is a challenging task even for regular time-varying signals, with stationarity models excelling at finding a sweet spot between practical relevance and analytical tractability. With this in mind, multiple efforts were carried out to generalize the definition of stationarity to graph signals [31], [32]. The key step was to say that a zero-mean random graph signal $x$ is stationary in a normal GSO $S$ if it can be modeled as $x = H(S)z$, with $z$ being white. This is equivalent to saying that the covariance matrix $C_x = \mathbb{E}[x x^\top]$ can be written as a polynomial of the GSO $S$, illustrating the relationship between the underlying graph and the statistical properties of the graph signal, and establishing meaningful links with Gaussian–Markov random fields that assume $S = C_x^{-1}$. With this definition, counterparts to concepts and tools such as the power spectral density, periodogram, Wiener filter, and autoregressive moving average models were developed [31]. These developments provide new ways to design graph-based covariance estimators and denoise graph signals as well as a rigorous framework to better model, understand, and control random processes residing on a graph.

We close this section by highlighting that, although some instances of the problems discussed had been investigated well before the GSP framework was put forth (e.g., denoising based on smooth priors given by powers of the Laplacian, or source identification based on graph-diffusion processes), those early works were mostly disconnected and focused on particular setups. The advent of GSP and use of a common language and theoretical framework served a number of purposes: 1) facilitating the identification of connections between and differences among existing works, 2) bringing different research communities together, 3) enabling the design of more complex processing architectures that use early works as building blocks, 4) providing a new set of tools for graph signals based on the generalization of classical SP schemes to the graph domain, and 5) aiding the development of novel, theoretically grounded solutions to graph-based problems that had been solved in a heuristic manner.

The impact of GSP on data science
GSP has transformed how the SP community deals with irregular geometric data; however, it has also contributed to areas that go beyond SP, having a significant impact on data science-related disciplines. To illustrate this, we next review several of the data science problems where GSP-based approaches have made significant contributions.

Graph learning
The field of GSP was originally conceived with a given graph ($\mathcal{G}$ or $S$) in mind. Such a graph could originate from a physical network, such as transportation, communication, social, or structural brain networks. However, in many applications, the graph is an implicit object that describes relationships or levels of association among the variables. In some cases, the links of those graphs can be based on expert domain knowledge (e.g., activation properties in protein-to-protein networks), but in many other cases, the graph must be inferred from the data themselves. Examples include graphs for image processing where the edges are defined based on both pixel distance and intensity differences, a k-nearest neighbor graph for SSL where edges connect data points with similar sets of features, or correlation graphs for functional brain networks. In those cases, the problem to solve can be formulated as “given a collection of $M$ graph signals $X = [x_1, \ldots, x_M] \in \mathbb{R}^{N \times M}$, find an $N \times N$ sparse graph matrix $S$ describing the relations among the nodes of the graph.” Clearly, such a problem is severely ill-posed, and models used to relate the properties of the graph and the signals are key to address it in a meaningful way.
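To make the two interpolation problems in (3) concrete, the following is a minimal NumPy sketch rather than the implementation used in the cited works. It assumes a hypothetical 8-node path graph with the combinatorial Laplacian as GSO, an ideal low-pass filter for $H(S)$, and arbitrary choices for `K`, `alpha`, and the sampling set; all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumption: toy undirected path graph with N nodes; GSO = combinatorial Laplacian.
N, K = 8, 3
A = np.zeros((N, N))
i = np.arange(N - 1)
A[i, i + 1] = A[i + 1, i] = 1.0
S = np.diag(A.sum(axis=1)) - A

# Eigenvectors of S, ordered by increasing eigenvalue (graph frequency).
lam, V = np.linalg.eigh(S)
V_K = V[:, :K]

# A K-band-limited signal x = V_K @ beta.
x = V_K @ rng.standard_normal(K)

# Selection matrix: rows of the identity indexed by a (hypothetical) sampling set.
sampled = np.array([0, 2, 3, 5, 7])
U_M = np.eye(N)[sampled]
x_M = U_M @ x

# First problem in (3): x = V_K (U_M V_K)^+ x_M, valid when U_M V_K has full column rank.
assert np.linalg.matrix_rank(U_M @ V_K) == K
x_bl = V_K @ np.linalg.pinv(U_M @ V_K) @ x_M

# Second problem in (3), here with an ideal low-pass H(S) and noisy samples y_M.
H = V_K @ V_K.T
R = np.eye(N) - H
y_M = x_M + 0.05 * rng.standard_normal(len(sampled))
alpha = 0.5
x_sm = np.linalg.solve(U_M.T @ U_M + alpha * R.T @ R, U_M.T @ y_M)

print("band-limited recovery error:", np.linalg.norm(x_bl - x))
print("smooth interpolation error: ", np.linalg.norm(x_sm - x))
```

The regularized solution is obtained by solving the normal equations of the second objective in (3). For the ideal low-pass filter assumed here, $H(S)$ is a symmetric projection, so $(I - H(S))^\top (I - H(S)) = I - H(S) H^\top(S)$.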

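The graph-stationarity definition discussed above ($x = H(S)z$ with white $z$) can also be checked numerically. The sketch below is only an illustration under stated assumptions: a small random symmetric GSO, three arbitrary filter taps, and unit-variance white input so that $C_x = H(S)H^\top(S)$. It verifies that the vertex-domain and frequency-domain definitions of the filter coincide and that the covariance is diagonalized by the eigenvectors of $S$.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumption: random symmetric (hence normal) weighted adjacency matrix as GSO.
N = 6
mask = rng.random((N, N)) < 0.6
W = np.triu(rng.uniform(0.5, 1.5, size=(N, N)) * mask, k=1)
S = W + W.T
lam, V = np.linalg.eigh(S)          # S = V diag(lam) V^T

# Vertex-domain filter: H(S) = sum_l h_l S^l (hypothetical taps).
h = np.array([1.0, 0.5, 0.25])
H_vertex = sum(h_l * np.linalg.matrix_power(S, l) for l, h_l in enumerate(h))

# Frequency-domain filter: H(S) = V diag(h_hat) V^{-1}, with h_hat_k = sum_l h_l lam_k^l.
h_hat = np.vander(lam, N=len(h), increasing=True) @ h
H_freq = V @ np.diag(h_hat) @ V.T

# Covariance of x = H(S) z for white z with identity covariance.
C_x = H_vertex @ H_vertex.T

# In the graph frequency domain the covariance is diagonal; its diagonal
# plays the role of a power spectral density.
C_spec = V.T @ C_x @ V
psd = np.diag(C_spec)

print("filters agree:        ", np.allclose(H_vertex, H_freq))
print("covariance is diagonal:", np.allclose(C_spec, np.diag(psd)))
```

Because $C_x$ shares its eigenvectors with $S$, estimating the covariance of a stationary graph signal reduces to estimating the $N$ values in `psd` instead of a full $N \times N$ matrix.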
Learning a graph from data is a topic on its own, with roots in statistics, network science, and ML (see [11] and references therein). Initial approaches focused on the information associated with each node separately, so that the existence of the link $(i, j)$ in the graph was decided based only on the $i$th and $j$th rows of $X$. Contemporary (more advanced) approaches look at the problem as finding a mapping from $X$ to $S$, with graphical lasso (GL) being the most prominent example. GL is tailored for Gaussian–Markov random fields and sets the graph to a sparsified version of the precision matrix so that $S \approx ((1/M) X X^\top)^{-1}$ [11]. The contribution of GSP to the problem of graph learning [30], [35] falls into this second class of approaches, where the more sophisticated (spectral and/or polynomial) relationships between the signals and the graph can be fully leveraged. One cluster of early GSP works focused on learning a graph $S$ that made the signals in $X$ smooth with respect to the learned graph [29]. If smoothness is promoted using a Laplacian-based total-variation regularizer $\sum_{m=1}^{M} x_m^\top L x_m$, the formulation leads to a kernel-ridge regression problem with the pseudoinverse of $L$ as the kernel, and meaningful links with GL can be established [35]. A second set of GSP-based topology inference methods model the data $X$ as resulting from a diffusion process over the sought graph $S$ through a graph filter. The key questions when modeling the observations as $x_m = H(S) z_m$ are then the assumptions (if any) about the diffusing filter $H(S)$ and the input signals $z_m$. Assuming the inputs $z_m$ to be white, which is tantamount to assuming that the signals $x_m$ are stationary in $S$, leads to a model where the covariance (precision) matrix of the observations is a polynomial of the sought GSO $S$, all having the same eigenvectors. This not only provides a common umbrella to several existing graph-learning methods but also a new (spectral and/or polynomial) way to address graph estimation [36], [37]. Indeed, the fact that GSP offers a well-understood framework for modeling graph signals has propelled the investigation of multiple generalizations of the aforementioned methods, tackling, e.g., directed graphs, causal structure identification, presence of hidden nodes whose signals are never observed, dynamic networks, multilayer graphs, and nonlinear models of interaction. The interested reader is referred to [30] and the references therein for more details.

Network science
As discussed in the previous section, advancements in network science informed subsequent developments in GSP. It is now also the case that GSP techniques have been used to address network science problems such as clustering and community mining. We mention three examples here. First, in [38], spectral graph wavelets are utilized to develop a new, fast, multiscale community mining protocol. Second, by graph-spectral filtering of random graph signals, feature vectors can be efficiently constructed for each vertex in a manner such that the distances between vertices based on these feature vectors resemble the distances based on standard spectral clustering feature vectors. In [39], a detailed account is provided of how that approach and other new sampling and interpolation methods developed for GSP can be used to accelerate spectral clustering by avoiding k-means. Third, [40] uses spectral graph wavelets to learn structural embeddings that help identify vertices that have similar structural roles in the network, even though they may be distant in the graph.

SSL
The goal of SSL is to utilize a combination of labeled and unlabeled data to predict the labels of the unlabeled data points. The labels may be discrete (semisupervised classification) or continuous (semisupervised regression). Many of the graph-based SSL methods (e.g., [14]) investigated by the ML community in the early 2000s constructed an undirected, weighted-similarity graph, with each vertex representing one data point (either labeled or unlabeled), and then diffused the known labels across the graph to infer the labels at the unlabeled vertices. This approach can also be thought of as compelling the vector of labels to be smooth with respect to the underlying graph. Mathematically, this results in optimization problems with at least two terms: a fitting term that ensures that the vector of labels exactly or approximately matches the known labels on the vertices corresponding to the labeled data points, and a regularization term of the form $x^\top H(S) x$ for some (symmetric) GSO $S$ and (low-pass) graph filter $H(S)$ that enforces global smoothness of the signal [41] (as discussed in the “Graph Neural Networks” section).

Rather than enforcing global smoothness of the labels with respect to the underlying graph, another GSP approach to SSL is to encourage the labels to be piecewise smooth with respect to the graph by modeling them as a sparse linear combination of graph wavelet atoms [42]. Regularization problems resulting from this approach feature the same fitting term as mentioned previously, but the additional term in the objective function captures the sparsity prior through the norm (or mixed norm) of the coefficients used to synthesize the labels as a linear combination of the graph wavelets. Finally, in GSP parlance, SSL is intimately related to graph signal interpolation so that most of the results regarding the sampling and reconstruction of (band-limited) graph signals can be (and have been) applied to SSL.

Graph neural networks
Neural networks (NNs) are nonlinear data processing architectures composed of multiple layers, each of which combines (mixes) the inputs linearly via matrix multiplication and then applies a scalar nonlinear function to each of the entries of the output. The values of the mixing matrices $\{H_\ell\}_{\ell=1}^{L}$ are considered the parameters of the architecture. To avoid an excess of parameters, a standard approach is to impose some parsimonious structure on the mixing matrices (e.g., Toeplitz, low-rank, and sparse), giving rise to different families of NNs. Given the success of NNs, and convolutional NNs in particular, in processing regular data such as speech and images, a natural question is how best to generalize these architectures to data defined over irregular graph domains. In this context, the ML community investigated graph NNs that incorporate the graph ($\mathcal{G}$ or $S$) into NN architectures in different ways [6],

[43]. GSP offers a principled way to address this question, postulating that the matrices $\{H_\ell\}_{\ell=1}^{L}$ have the form of a graph filter $\{H_\ell(S)\}_{\ell=1}^{L}$. This offers both a flexible way to incorporate the graph (with the selection of the GSO $S$ being application dependent) and also provides a range of options for parameterizing the graph filter (e.g., polynomial, rational, and diffusion filters). Similarly, a number of generalizations and novel architectures that leverage GSP have been proposed, including pooling schemes based on sampling over graphs, graph-recurrent NNs, architectures defined over product graphs, and NNs based on graphon filters [44]. GSP has not only provided a common framework to better understand the contributions of and links between many of the existing works but has also facilitated novel contributions on subjects such as transferability, robustness, or sensitivity with respect to the graph [45].

Graph-time processing
In many applications, a time series, as opposed to a scalar value, is observed at each node of the graph $\mathcal{G}$. If the length of each time series is $T$, the data at hand can be arranged in the form of a matrix $X = [x_1, \ldots, x_T] \in \mathbb{R}^{N \times T}$, which can be viewed as a collection of $N$ time series (one per node of the graph), a collection of $T$ graph signals, or as a single signal $\mathrm{vec}(X) \in \mathbb{R}^{NT}$ that varies across both time and the nodes of the graph $\mathcal{G}$. The first approaches to handle time-varying graph signals were based on product graphs that combine a graph of the vertices with a graph for the time domain (e.g., a directed cycle graph $\mathcal{G}_{dc}$) to obtain a single larger graph $\mathcal{G} \times \mathcal{G}_{dc}$ with $NT$ nodes [46], [47]. This interpretation allows for the use of standard GSP tools such as the GFT and graph filters, with the joint GFT being the Kronecker product of the original GFT $V^{-1}$ and the DFT matrix $F^H$, and the joint GSO some chosen product (e.g., Kronecker, Cartesian, and strong) of the respective GSOs. Indeed, the joint spectrum of the time-varying graph signal $\mathrm{vec}(X)$ can be analyzed this way, and joint graph-time filters can be adopted for their denoising or interpolation. In their most general form, those filters need not be separable over the graph and time domains, thereby increasing their modeling and processing potential.

Later, vector autoregressive (VAR) processes were considered for graph-time processing. A VAR model describes a vector process by expressing the current vector as a matrix-weighted version of past vectors plus some innovation, i.e., $x_t = \sum_{p=1}^{P} A_p x_{t-p} + e_t$. Considering that the vectors we are handling are graph signals, the underlying graph structure can be incorporated in such VAR models in different ways, leading to different GSP extensions. One direction is to replace the matrix weights by graph filters, i.e., $A_p = H_p(S)$, leading to graph VAR processes [48]. In such models, the graph filter can be implemented in the graph frequency domain or as a polynomial of the GSO in the vertex domain. Furthermore, causal models have been assumed where the polynomial order of the graph filter cannot be larger than the time delay on which the filter operates [49]. Another extension of VAR models also considers the interaction between the different nodes of the current vector, i.e., $x_t = A_0 x_t + \sum_{p=1}^{P} A_p x_{t-p} + e_t$, where $A_0$ has a zero diagonal. It further enforces sparsity on all of the matrix weights. In such structural VAR processes [50], the matrix weights can be viewed as graph-adjacency matrices that link the current data on a node with past data on the same node as well as with current and past data on neighboring nodes. Extensions to nonlinear versions have also been considered.

The value of GSP in science and engineering applications
Not surprisingly, GSP methods have been applied to engineering networks where a clear definition of the graph follows from a physical network. These include communication networks (e.g., developing distributed schemes to estimate the channels), smart grids and power networks (e.g., designing distributed resource allocation algorithms for power flow), water networks, and transportation networks (e.g., developing graph-based architectures to predict traffic delay). Similarly, GSP has also contributed to applications where the network is not explicitly observable but can be inferred from additional information, such as social networks, meteorological prediction, genetics, and financial engineering. Although all of the previous examples are meaningful and relevant, here we briefly highlight the two areas with the largest and most consistent GSP activity over the past decade: neuroscience and image and video processing.

Applications to neuroscience
Graphs have a long history in neuroscience because they can be used to represent different relationships and pairwise connections between regions of the brain, taking each region to be a vertex [15]. An anatomical brain graph captures structural connections between the regions, as measured, e.g., via fiber tracts in white matter captured through diffusion magnetic resonance imaging (MRI). A functional brain graph, on the other hand, aims to capture pairwise interdependencies between activity that is measured in the different brain regions. Identifying the functional brain graph has been studied extensively for different reasons and with different modalities, the most common of which is functional MRI (fMRI). Often, such studies also involve the estimation of dynamic graphs [51], [52]. During a sequence of task and rest periods, it has, for instance, been shown that on- and off-task functional brain graphs differ substantially [51]. Recent work also demonstrates that dynamics in the functional brain graph even exist during resting-state fMRI, with meaningful correlations with electroencephalography, demographic, and behavioral data [52].

Interestingly, most of the graph-based approaches in neuroscience consist of first identifying a brain graph and then using graph-theoretical and network science tools to analyze its properties. From this point of view, GSP tools can be (and have been) leveraged for learning brain graphs [53]. However, GSP really shines when it comes to analyzing how the measured activity pattern (the brain signal) behaves in relation to a brain graph (either anatomical or functional, related to one or multiple subjects). In other words, GSP provides a technology to merge the brain function, contained in the

brain signal, with the brain graph (see [53] and references therein). Specifically, the GFT has been used to analyze cognitive behavior. For example, [54] shows that there is a relationship between the energy of the high-frequency content of an fMRI signal and the attention-switching ability of an individual. There is further research from the same group that states that, when learning a task, the correlations between the learning rate and the energies of the low-/high-frequency content of an fMRI signal change with the exposure time, i.e., they depend on how familiar we are with the task. In addition to the GFT, graph wavelets and Slepians have been used to reveal localized frequency content in the brain [53], and graph filters have been used as diffusion operators to model disease progression in dementia. Although these results demonstrate the potential GSP has for neuroscience, we believe this pairing is still in its infancy, and that there is plenty of room for exploration.

Applications to image and video processing
As noted earlier, widely used techniques in image and video processing, including transforms such as the DCT and the KLT, segmentation methods, and image filtering can be interpreted from a GSP perspective [55]. In recent years, the emergence of a broader understanding of GSP has led to a further evolution of how graph-based approaches are used for image processing. As an example, although the DCT or asymmetric discrete sine transform are formed by the eigenvectors of path graphs with equal edge weights, extensions have been proposed where graph edges with lower weights can be introduced in between pixels corresponding to image contours [56]. In these approaches, as in input-dependent image filtering [57], the image structure is first analyzed (e.g., contours detected), and then transforms adapted to the image characteristics are selected, with the choice of transform sent as side information.

A particularly promising application of GSP methods is to point cloud processing and compression. Each point in a point cloud is defined by its coordinates in 3D space and has associated with it an attribute (e.g., color or reflectance). Although points are in a Euclidean domain, their positions, on the surfaces of the objects in the scene, are irregular and make it natural to develop a graph-based processing approach. Transforms have been proposed that leverage or are closely related to the GFT of a point cloud graph [58]. These methods are fundamental algorithms for geometry-based point cloud compression. Additionally, point cloud processing has become a major application domain for graph ML, with applications in areas such as denoising [59].

The future ahead
The focus of this article has been on reviewing the early results and growth of GSP, with an eye not only on the SP community but also the applications and data science problems that have benefited from GSP. We close by discussing some of the emerging directions and open problems that we believe will shape the future of the field.

One emerging area in the field of GSP is dynamic graphs; more specifically, how to estimate them, and how to process time-varying graph signals residing on them. Graphs are rarely static; think, for instance, about social networks with new users or changing connections, or functional brain networks determined by a specific task that is carried out at a particular time instant. As a result, GSP tools, theory, and algorithms need to be extended to such scenarios. There is already quite some work on graph topology identification for dynamic graphs, but most of these methods link consecutive graphs in the cost function, making the problems computationally challenging [30], [50]. Adaptive methods (of the correction-only or prediction-correction type) try to tackle this issue, but tracking rates are still low. Processing signals residing on time-varying graphs has not been studied in depth, and this is clearly an area where many opportunities arise.

Extending GSP to higher-order graphs is another important future direction. Some applications are characterized by a graph domain where more than two nodes can interact; think, for instance, about a coauthorship network where groups of coauthors who collaborated on a paper are linked together, or about movie graphs in recommender systems, where movies starring the same actor form a group. Such graphs where an edge can join more than two nodes are called higher-order graphs. Popular abstractions of higher-order graphs are simplicial complexes and cell complexes. A simplicial/cell complex is a collection of subsets of the set of nodes satisfying certain properties. Whereas in a simplicial complex, the subsets satisfy the subset inclusion property (e.g., there needs to be links among each pair of the three coauthors of a paper), in a cell complex, they do not. However, both types of complexes share a similar recursive relationship between the higher-order Laplacians, leading to a hierarchical processing architecture that can process node signals over edges, edge signals over triangles/polygons (for a simplicial/cell complex), and so on. A less restrictive representation of a higher-order graph is a hypergraph $\mathcal{H} = (\mathcal{V}, \mathcal{E}, \omega)$, where $\omega$ is a function that assigns a weight to each hyperedge in $\mathcal{E}$. Hyperedges can connect more than two vertices in $\mathcal{V}$. Some recent overviews on higher-order networks, with focuses on GSP and network science, respectively, can be found in [60] and [61]. There are still many open issues in higher-order GSP, including the exploration of connections to adjacent fields such as topological data analysis and computational geometry.

Many other open problems (extending GSP to include uncertainty in the signals and graphs, design of exact and approximate Bayesian (recursive) estimators able to track variations across nodes and time, developing GSP models for categorical data, generalizing GSP results to continuous manifold (geometric) data, incorporating GSP tools into reinforcement learning and spatiotemporal control, and so on) are also expected to play important roles in the future of the discipline. If the first years of GSP combined theoretical developments with practical applications by placing a stronger focus on the former, we expect that the coming years will see an increased emphasis on applications, along with important efforts on learning and statistical schemes.
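Returning to the graph NN discussion, where the mixing matrices of each layer are postulated to be graph filters, the idea can be sketched in a few lines of NumPy. This is a hedged illustration and not an architecture from the cited works: it assumes polynomial filters, a ReLU nonlinearity, a single feature per node, and hypothetical tap values; `graph_filter` and `gnn_forward` are illustrative names.

```python
import numpy as np

def graph_filter(S, taps):
    """Polynomial graph filter H(S) = sum_k taps[k] * S^k."""
    H = np.zeros_like(S, dtype=float)
    S_pow = np.eye(S.shape[0])
    for h_k in taps:
        H += h_k * S_pow
        S_pow = S_pow @ S
    return H

def gnn_forward(S, x, taps_per_layer):
    """Each layer mixes the input with a graph filter, then applies a pointwise ReLU."""
    for taps in taps_per_layer:
        x = np.maximum(graph_filter(S, taps) @ x, 0.0)
    return x

rng = np.random.default_rng(3)
N = 5
B = rng.random((N, N))
S = (B + B.T) / 2                      # assumption: symmetric weighted GSO
x = rng.standard_normal(N)

# Hypothetical parameters: three taps in layer 1, two in layer 2 (5 numbers in
# total, versus 2 * N^2 = 50 for unstructured mixing matrices).
taps_per_layer = [np.array([0.5, 0.3, -0.2]), np.array([1.0, -0.4])]
y = gnn_forward(S, x, taps_per_layer)

# Relabeling the nodes relabels the output in the same way.
P = np.eye(N)[rng.permutation(N)]
y_perm = gnn_forward(P @ S @ P.T, P @ x, taps_per_layer)
print("permutation equivariant:", np.allclose(y_perm, P @ y))
```

The number of learnable parameters depends on the filter order and not on the size of the graph, which is one way to read the parsimony argument made for structured mixing matrices.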

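The product-graph view of graph-time processing described earlier, with the joint GFT given by the Kronecker product of $V^{-1}$ and $F^H$, can likewise be illustrated numerically. The sketch below makes several assumptions that are not fixed by the text: a symmetric vertex GSO (so that $V^{-1} = V^\top$), column-major stacking for $\mathrm{vec}(X)$, a unitary DFT, and the Cartesian product as the joint GSO.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T = 4, 6

# Assumption: symmetric GSO for the vertex graph, so V is orthonormal.
B = rng.random((N, N))
S = np.triu(B, k=1)
S = S + S.T
lam, V = np.linalg.eigh(S)
V_inv = V.T

# Time-domain Fourier basis F (columns are complex exponentials); F^H is the unitary DFT.
F = np.fft.ifft(np.eye(T), axis=0, norm="ortho")
F_H = F.conj().T

# Data matrix: one length-T time series per node.
X = rng.standard_normal((N, T))

def vec(Z):
    """Stack the columns of Z, i.e., concatenate the graph signals x_1, ..., x_T."""
    return Z.reshape(-1, order="F")

# Joint GFT as a Kronecker product acting on vec(X) ...
joint_gft = np.kron(F_H, V_inv)
X_hat_vec = joint_gft @ vec(X)

# ... and the equivalent separable form: GFT across nodes, DFT across time.
X_hat = V_inv @ np.fft.fft(X, axis=1, norm="ortho")

# Cartesian-product GSO built from a directed cycle graph for the time axis.
S_dc = np.roll(np.eye(T), 1, axis=0)
S_joint = np.kron(S_dc, np.eye(N)) + np.kron(np.eye(T), S)

# The joint GFT diagonalizes the joint GSO.
D = joint_gft @ S_joint @ joint_gft.conj().T

print("Kronecker and separable forms agree:", np.allclose(X_hat_vec, vec(X_hat)))
print("joint GSO is diagonalized:          ", np.allclose(D, np.diag(np.diag(D))))
```

In practice the separable form is preferable, since it avoids building the $NT \times NT$ Kronecker matrix; the explicit product is formed here only to check the identity.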
Acknowledgments ber of the Portugal Academy of Sciences and the U.S.
An extended version of this article (including additional National Academy of Engineering.
references) is available at http://arxiv.org/abs/2303.12211. Antonio Ortega ([email protected]) received
G. Leus is partially supported by the TTW-OTP project his Ph.D. degree in electrical engineering from Columbia
GraSPA (project number 19497) financed by the Dutch Re- University. He is Dean’s Professor of Electrical and Computer
search Council (NWO). A. G. Marques acknowledges the Engineering, at the University of Southern California, Los
support of the Spanish NSF grant PID2019-105032GB-I00/ Angeles, CA 90089 USA. He has received several paper
AEI/10.13039/501100011033. The work of J. M. F. Moura was awards, including the 2016 Signal Processing Magazine
partially supported by NSF Grant CCN 1513936. award. He is the author of the book, “Introduction to Graph
Signal Processing,” published by Cambridge University Press
Authors in 2022. He was editor-in-chief of IEEE Transactions of
Geert Leus ([email protected]) received his Ph.D. degree Signal and Information Processing over Networks and served
in electrical engineering from KU Leuven in 2000. He is a on the Board of Governors of the IEEE Signal Processing
professor at the Delft University of Technology, 2628CD Society. His recent research work focuses on graph signal
Delft, The Netherlands. He serves as chair of the EURASIP processing, machine learning, and multimedia compression.
Signal Processing for Multisensor Systems Technical Area He is a Fellow of IEEE and the European Association for
Committee and editor-in-chief of EUR ASIP Signal Signal Processing.
Processing. He received the 2021 EURASIP Individual David I Shuman ([email protected]) received his Ph.D.
Technical Achievement Award, a 2005 IEEE SPS Best Paper degree in electrical engineering systems from the University
Award, and a 2002 IEEE SPS Young Author Best Paper of Michigan in 2010. He is a professor of data science and
Award. He served as a member-at-large on the Board of applied mathematics at Franklin W. Olin College of
Governors of the IEEE Signal Processing Society, chair of Engineering, Needham, MA 02492 USA. He is an associate
the IEEE International Conference on Signal Processing and editor of IEEE Transactions on Signal Processing and has
Communications Technical Committee, and editor-in-chief received multiple IEEE best paper awards. His research inter-
of EURASIP Journal on Advances in Signal Processing. He ests include signal processing on graphs, computational har-
is a Fellow of IEEE and the European Association for monic analysis, and stochastic scheduling problems.
Signal Processing.
Antonio G. Marques ([email protected]) received his doctorate degree in telecommunications engineering (with highest honors) from Carlos III University of Madrid in 2007. He is a professor with the Department of Signal Theory and Communications, King Juan Carlos University, 28942 Madrid, Spain. He has received multiple paper awards and was also a recipient of the 2020 EURASIP Early Career Award. His research interests lie in the areas of signal processing, machine learning, and network science. He is a Member of IEEE, the European Association for Signal Processing, and the European Laboratory for Learning and Intelligent Systems Society.

José M.F. Moura ([email protected]) received his D.Sc. degree in electrical engineering and computer science from the Massachusetts Institute of Technology. He is the Philip Marsha Dowd University Professor, the Department of Electrical and Computer Engineering, Carnegie Mellon University (CMU), Pittsburgh, PA 15213 USA. His patented detector (co-inventor Alek Kavcic) is in more than 60% of computers sold in the last 18 years (4 billion). CMU settled with Marvell its infringement for US$750 million. He was the 2019 IEEE president and CEO. He holds honorary doctorate degrees from the University of Strathclyde and Universidade de Lisboa and has received the Great Cross and Order of Infante D. Henrique. He received the 2023 IEEE Kilby Signal Processing Medal. His research interests include statistical, distributed, and graph signal processing. He is a Fellow of IEEE, the American Association for the Advancement of Science, and the National Academy of Inventors, and a mem-

References
[1] D. K. Hammond, P. Vandergheynst, and R. Gribonval, “Wavelets on graphs via spectral graph theory,” Appl. Comput. Harmon. Anal., vol. 30, no. 2, pp. 129–150, Mar. 2011, doi: 10.1016/j.acha.2010.04.005.
[2] D. I Shuman, S. K. Narang, P. Frossard, A. Ortega, and P. Vandergheynst, “The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains,” IEEE Signal Process. Mag., vol. 30, no. 3, pp. 83–98, May 2013, doi: 10.1109/MSP.2012.2235192.
[3] A. Sandryhaila and J. M. F. Moura, “Discrete signal processing on graphs,” IEEE Trans. Signal Process., vol. 61, no. 7, pp. 1644–1656, Apr. 2013, doi: 10.1109/TSP.2013.2238935.
[4] A. Sandryhaila and J. M. F. Moura, “Discrete signal processing on graphs: Frequency analysis,” IEEE Trans. Signal Process., vol. 62, no. 12, pp. 3042–3054, Jun. 2014, doi: 10.1109/TSP.2014.2321121.
[5] L. Stanković et al., Data Analytics on Graphs. Boston, MA, USA: Now Publishers, 2021.
[6] M. M. Bronstein, J. Bruna, Y. LeCun, A. Szlam, and P. Vandergheynst, “Geometric deep learning: Going beyond Euclidean data,” IEEE Signal Process. Mag., vol. 34, no. 4, pp. 18–42, Jul. 2017, doi: 10.1109/MSP.2017.2693418.
[7] C. Godsil and G. Royle, Algebraic Graph Theory. Berlin, Germany: Springer-Verlag, 2001.
[8] D. M. Cvetković, M. Doob, and H. Sachs, Spectra of Graphs: Theory and Application. New York, NY, USA: Academic, 1980.
[9] G. Taubin, “A signal processing approach to fair surface design,” in Proc. 22nd Annu. Conf. Comp. Graph. Interactive Techn. (SIGGRAPH), 1995, pp. 351–358.
[10] A. Elmoataz, O. Lezoray, and S. Bougleux, “Nonlocal discrete regularization on weighted graphs: A framework for image and manifold processing,” IEEE Trans. Image Process., vol. 17, no. 7, pp. 1047–1060, Jul. 2008, doi: 10.1109/TIP.2008.924284.
[11] E. D. Kolaczyk, Statistical Analysis of Network Data: Methods and Models. New York, NY, USA: Springer-Verlag, 2009.
[12] M. J. Wainwright and M. I. Jordan, “Graphical models, exponential families, and variational inference,” Found. Trends® Mach. Learn., vol. 1, nos. 1–2, pp. 1–305, Nov. 2008, doi: 10.1561/2200000001.
[13] M. Belkin and P. Niyogi, “Laplacian eigenmaps for dimensionality reduction and data representation,” Neural Comput., vol. 15, no. 6, pp. 1373–1396, Jun. 2003, doi: 10.1162/089976603321780317.

[14] A. Smola and R. Kondor, “Kernels and regularization on graphs,” in Proc. Conf. Learn. Theory (COLT), Aug. 2003, pp. 144–158, doi: 10.1007/978-3-540-45167-9_12.
[15] O. Sporns, Networks of the Brain. Cambridge, MA, USA: MIT Press, 2010.
[16] C. Tomasi and R. Manduchi, “Bilateral filtering for gray and color images,” in Proc. 6th Int. Conf. Comput. Vision (ICCV), Jan. 1998, pp. 839–846, doi: 10.1109/ICCV.1998.710815.
[17] D. Koller and N. Friedman, Probabilistic Graphical Models: Principles and Techniques. Cambridge, MA, USA: MIT Press, 2009.
[18] M. Newman, Networks. Oxford, U.K.: Oxford Univ. Press, 2018.
[19] R. Pastor-Satorras and A. Vespignani, “Epidemic spreading in scale-free networks,” Phys. Rev. Lett., vol. 86, no. 14, Apr. 2001, Art. no. 3200, doi: 10.1103/PhysRevLett.86.3200.
[20] M. Crovella and E. Kolaczyk, “Graph wavelets for spatial traffic analysis,” in Proc. IEEE 22nd Annu. Joint Conf. IEEE Comput. Commun. Soc., 2003, pp. 1848–1857, doi: 10.1109/INFCOM.2003.1209207.
[21] R. R. Coifman and M. Maggioni, “Diffusion wavelets,” Appl. Comput. Harmon. Anal., vol. 21, no. 1, pp. 53–94, Jun. 2006, doi: 10.1016/j.acha.2006.04.004.
[22] M. Püschel and J. M. F. Moura, “Algebraic signal processing theory: Foundation and 1-D time,” IEEE Trans. Signal Process., vol. 56, no. 8, pp. 3572–3585, Aug. 2008, doi: 10.1109/TSP.2008.925261.
[23] M. Püschel and J. M. F. Moura, “Algebraic signal processing theory: 1-D space,” IEEE Trans. Signal Process., vol. 56, no. 8, pp. 3586–3599, Aug. 2008, doi: 10.1109/TSP.2008.925259.
[24] A. Ortega, P. Frossard, J. Kovačević, J. M. F. Moura, and P. Vandergheynst, “Graph signal processing: Overview, challenges, and applications,” Proc. IEEE, vol. 106, no. 5, pp. 808–828, May 2018, doi: 10.1109/JPROC.2018.2820126.
[25] D. I Shuman, “Localized spectral graph filter frames: A unifying framework, survey of design considerations, and numerical comparison,” IEEE Signal Process. Mag., vol. 37, no. 6, pp. 43–63, Nov. 2020, doi: 10.1109/MSP.2020.3015024.
[26] S. Segarra, A. G. Marques, and A. Ribeiro, “Optimal graph-filter design and applications to distributed linear network operators,” IEEE Trans. Signal Process., vol. 65, no. 15, pp. 4117–4131, Apr. 2017, doi: 10.1109/TSP.2017.2703660.
[27] E. Isufi, A. Loukas, A. Simonetto, and G. Leus, “Autoregressive moving average graph filtering,” IEEE Trans. Signal Process., vol. 65, no. 2, pp. 274–288, Jan. 2017, doi: 10.1109/TSP.2016.2614793.
[28] S. Chen, R. Varma, A. Sandryhaila, and J. Kovačević, “Discrete signal processing on graphs: Sampling theory,” IEEE Trans. Signal Process., vol. 63, no. 24, pp. 6510–6523, Dec. 2015, doi: 10.1109/TSP.2015.2469645.
[29] X. Dong, D. Thanou, P. Frossard, and P. Vandergheynst, “Learning Laplacian matrix in smooth graph signal representations,” IEEE Trans. Signal Process., vol. 64, no. 23, pp. 6160–6173, Dec. 2016, doi: 10.1109/TSP.2016.2602809.
[30] G. Mateos, S. Segarra, A. G. Marques, and A. Ribeiro, “Connecting the dots: Identifying network structure via graph signal processing,” IEEE Signal Process. Mag., vol. 36, no. 3, pp. 16–43, May 2019, doi: 10.1109/MSP.2018.2890143.
[31] A. G. Marques, S. Segarra, G. Leus, and A. Ribeiro, “Stationary graph processes and spectral estimation,” IEEE Trans. Signal Process., vol. 65, no. 22, pp. 5911–5926, Nov. 2017, doi: 10.1109/TSP.2017.2739099.
[32] N. Perraudin and P. Vandergheynst, “Stationary signal processing on graphs,” IEEE Trans. Signal Process., vol. 65, no. 13, pp. 3462–3477, Jul. 2017, doi: 10.1109/TSP.2017.2690388.
[33] S. Segarra, G. Mateos, A. G. Marques, and A. Ribeiro, “Blind identification of graph filters,” IEEE Trans. Signal Process., vol. 65, no. 5, pp. 1146–1159, Mar. 2017, doi: 10.1109/TSP.2016.2628343.
[34] Y. Tanaka, Y. C. Eldar, A. Ortega, and G. Cheung, “Sampling signals on graphs: From theory to applications,” IEEE Signal Process. Mag., vol. 37, no. 6, pp. 14–30, Nov. 2020, doi: 10.1109/MSP.2020.3016908.
[35] X. Dong, D. Thanou, M. Rabbat, and P. Frossard, “Learning graphs from data: A signal representation perspective,” IEEE Signal Process. Mag., vol. 36, no. 3, pp. 44–63, May 2019, doi: 10.1109/MSP.2018.2887284.
[36] S. Segarra, A. G. Marques, G. Mateos, and A. Ribeiro, “Network topology inference from spectral templates,” IEEE Trans. Signal Inf. Process. Netw., vol. 3, no. 3, pp. 467–483, Sep. 2017, doi: 10.1109/TSIPN.2017.2731051.
[37] B. Pasdeloup, V. Gripon, G. Mercier, D. Pastor, and M. G. Rabbat, “Characterization and inference of graph diffusion processes from observations of stationary signals,” IEEE Trans. Signal Inf. Process. Netw., vol. 4, no. 3, pp. 481–496, Sep. 2018, doi: 10.1109/TSIPN.2017.2742940.
[38] N. Tremblay and P. Borgnat, “Graph wavelets for multiscale community mining,” IEEE Trans. Signal Process., vol. 62, no. 20, pp. 5227–5239, Oct. 2014, doi: 10.1109/TSP.2014.2345355.
[39] N. Tremblay and A. Loukas, “Approximating spectral clustering via sampling: A review,” in Sampling Techniques for Supervised or Unsupervised Tasks, F. Ros and S. Guillaume, Eds. Cham, Switzerland: Springer-Verlag, 2020, pp. 129–183.
[40] C. Donnat, M. Zitnik, D. Hallac, and J. Leskovec, “Learning structural node embeddings via diffusion wavelets,” in Proc. 24th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining (KDD), 2018, pp. 1320–1329, doi: 10.1145/3219819.3220025.
[41] D. I Shuman, P. Vandergheynst, D. Kressner, and P. Frossard, “Distributed signal processing via Chebyshev polynomial approximation,” IEEE Trans. Signal Inf. Process. Netw., vol. 4, no. 4, pp. 736–751, Dec. 2018, doi: 10.1109/TSIPN.2018.2824239.
[42] D. I Shuman, M. Faraji, and P. Vandergheynst, “Semi-supervised learning with spectral graph wavelets,” in Proc. Int. Conf. Sampling Theory Appl. (SampTA), 2011.
[43] M. Defferrard, X. Bresson, and P. Vandergheynst, “Convolutional neural networks on graphs with fast localized spectral filtering,” in Proc. Int. Conf. Neural Inf. Process. Syst. (NeurIPS), 2016, pp. 3844–3852.
[44] A. G. Marques, N. Kiyavash, J. M. F. Moura, D. V. D. Ville, and R. Willett, “Graph signal processing: Foundations and emerging directions,” IEEE Signal Process. Mag., vol. 37, no. 6, pp. 11–13, Nov. 2020, doi: 10.1109/MSP.2020.3020715.
[45] L. Ruiz, F. Gama, and A. Ribeiro, “Graph neural networks: Architectures, stability, and transferability,” Proc. IEEE, vol. 109, no. 5, pp. 660–682, May 2021, doi: 10.1109/JPROC.2021.3055400.
[46] A. Sandryhaila and J. M. F. Moura, “Big data analysis with signal processing on graphs: Representation and processing of massive data sets with irregular structure,” IEEE Signal Process. Mag., vol. 31, no. 5, pp. 80–90, Sep. 2014, doi: 10.1109/MSP.2014.2329213.
[47] F. Grassi, A. Loukas, N. Perraudin, and B. Ricaud, “A time-vertex signal processing framework: Scalable processing and meaningful representations for time-series on graphs,” IEEE Trans. Signal Process., vol. 66, no. 3, pp. 817–829, Feb. 2018, doi: 10.1109/TSP.2017.2775589.
[48] E. Isufi, A. Loukas, N. Perraudin, and G. Leus, “Forecasting time series with VARMA recursions on graphs,” IEEE Trans. Signal Process., vol. 67, no. 18, pp. 4870–4885, Sep. 2019, doi: 10.1109/TSP.2019.2929930.
[49] J. Mei and J. M. F. Moura, “Signal processing on graphs: Causal modeling of unstructured data,” IEEE Trans. Signal Process., vol. 65, no. 8, pp. 2077–2092, Apr. 2017, doi: 10.1109/TSP.2016.2634543.
[50] G. B. Giannakis, Y. Shen, and G. V. Karanikolas, “Topology identification and learning over graphs: Accounting for nonlinearities and dynamics,” Proc. IEEE, vol. 106, no. 5, pp. 787–807, May 2018, doi: 10.1109/JPROC.2018.2804318.
[51] R. P. Monti, P. Hellyer, D. Sharp, R. Leech, C. Anagnostopoulos, and G. Montana, “Estimating time-varying brain connectivity networks from functional MRI time series,” NeuroImage, vol. 103, pp. 427–443, Dec. 2014, doi: 10.1016/j.neuroimage.2014.07.033.
[52] M. G. Preti, T. A. W. Bolton, and D. Van De Ville, “The dynamic functional connectome: State-of-the-art and perspectives,” NeuroImage, vol. 160, pp. 41–54, Oct. 2017, doi: 10.1016/j.neuroimage.2016.12.061.
[53] W. Huang, T. A. W. Bolton, J. D. Medaglia, D. S. Bassett, A. Ribeiro, and D. Van De Ville, “A graph signal processing perspective on functional brain imaging,” Proc. IEEE, vol. 106, no. 5, pp. 868–885, May 2018, doi: 10.1109/JPROC.2018.2798928.
[54] J. D. Medaglia et al., “Functional alignment with anatomical networks is associated with cognitive flexibility,” Nature Human Behav., vol. 2, no. 2, pp. 156–164, Feb. 2018, doi: 10.1038/s41562-017-0260-9.
[55] G. Cheung, E. Magli, Y. Tanaka, and M. K. Ng, “Graph spectral image processing,” Proc. IEEE, vol. 106, no. 5, pp. 907–930, May 2018, doi: 10.1109/JPROC.2018.2799702.
[56] W. Hu, G. Cheung, A. Ortega, and O. C. Au, “Multiresolution graph Fourier transform for compression of piecewise smooth images,” IEEE Trans. Image Process., vol. 24, no. 1, pp. 419–433, Jan. 2015, doi: 10.1109/TIP.2014.2378055.
[57] P. Milanfar, “A tour of modern image filtering: New insights and methods, both practical and theoretical,” IEEE Signal Process. Mag., vol. 30, no. 1, pp. 106–128, Jan. 2013, doi: 10.1109/MSP.2011.2179329.
[58] R. L. De Queiroz and P. A. Chou, “Compression of 3D point clouds using a region-adaptive hierarchical transform,” IEEE Trans. Image Process., vol. 25, no. 8, pp. 3947–3956, Aug. 2016, doi: 10.1109/TIP.2016.2575005.
[59] D. Valsesia, G. Fracastoro, and E. Magli, “Deep graph-convolutional image denoising,” IEEE Trans. Image Process., vol. 29, pp. 8226–8237, Aug. 2020, doi: 10.1109/TIP.2020.3013166.
[60] M. T. Schaub, Y. Zhu, J.-B. Seby, T. M. Roddenberry, and S. Segarra, “Signal processing on higher-order networks: Livin’ on the edge… and beyond,” Signal Process., vol. 187, Oct. 2021, Art. no. 108149, doi: 10.1016/j.sigpro.2021.108149.
[61] F. Battiston et al., “Networks beyond pairwise interactions: Structure and dynamics,” Phys. Rep., vol. 874, pp. 1–92, Aug. 2020, doi: 10.1016/j.physrep.2020.05.004.
