Distributed Architecture of Subspace Clustering and Related
Sparse Subspace Clustering
Low-Rank Representation
Least Squares Regression
Multiview Subspace Clustering
2. 2
Main Goal
• Feature extraction and clustering at the same time from incomplete tensor data, by combining
• Subspace clustering, and
• Tensor decomposition/completion
• Subspace clustering models (based on spectral clustering):
• SSC (Sparse Subspace Clustering)
• LRR (Low-Rank Representation), BD-LRR (Block-Diagonal LRR), …
• LSR (Least Squares Regression), …
• Tensor decomposition/completion algorithms:
• Convex: ADMM
• Non-convex: Bayesian approach (probabilistic model for tensor CP, Tucker, or TT)
• Complexity, convergence, …
• Applications:
• Face recognition
• Object/action classification
• Face/gait clustering, multi-view clustering
• TIM (Topological Interference Management)
• Competing approaches:
• PCA (Principal Component Analysis)? Non-linear subspace clustering? …
• TensorFlow regression model? Or SSC + GAN (Generative Adversarial Networks)? …
3. 3
Sparse Subspace Clustering
• Let {S_ℓ}_{ℓ=1}^n be n linear subspaces of ℝ^D, each subspace S_ℓ of dimension d_ℓ, 0 < d_ℓ < D.
• The N noise-free data points {y_i}_{i=1}^N lie in the union of the n linear subspaces.
• Y ≡ [y_1, ..., y_N] = [Y_1, ..., Y_n] Γ, where Y_ℓ ∈ ℝ^{D × N_ℓ} is a rank-d_ℓ matrix of the N_ℓ > d_ℓ points that lie in S_ℓ, and Γ is an unknown permutation matrix.
• A priori, the bases of the subspaces and which data points belong to which subspace are both unknown.
• Subspace clustering: finding the number of subspaces, their dimensions, a basis for each subspace, and the segmentation of the data from Y.
• Self-expressiveness property of Y: y_i = Y c_i, c_ii = 0, where c_i ≡ [c_i1, ..., c_iN]^T.
• There exists a sparse solution c_i whose nonzero entries correspond to data points from the same subspace as y_i (# of nonzero elements = dimension of the subspace).
• In matrix form: Y = YC, diag(C) = 0, where C ≡ [c_1, ..., c_N] ∈ ℝ^{N × N}.
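As a concrete illustration of the self-expressiveness step, the sketch below solves a penalized ℓ1 version of the program column by column with scikit-learn's Lasso; the function name, the penalty weight alpha, and the use of the Lagrangian (rather than the exact equality-constrained) form are illustrative choices, not the exact formulation on the slide.

```python
import numpy as np
from sklearn.linear_model import Lasso

def ssc_self_expression(Y, alpha=0.01):
    """Sparse self-expression step of SSC (penalized l1 form, a sketch).

    Y     : D x N data matrix, one data point per column.
    alpha : l1 penalty weight (must be tuned for the data at hand).
    Returns C (N x N) with diag(C) = 0.
    """
    D, N = Y.shape
    C = np.zeros((N, N))
    for i in range(N):
        # Express y_i using all other columns of Y (this enforces c_ii = 0).
        idx = [j for j in range(N) if j != i]
        lasso = Lasso(alpha=alpha, fit_intercept=False, max_iter=5000)
        lasso.fit(Y[:, idx], Y[:, i])
        C[idx, i] = lasso.coef_
    return C
```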
4. 4
Clustering using Sparse Coefficients
• Build a weighted graph G = (V, E, W), where W ∈ ℝ^{N × N} is a symmetric non-negative similarity matrix representing the weights of the edges: w_ij = |c_ij| + |c_ji|, i.e. W = |C| + |C|^T.
• For data drawn from n subspaces, the similarity matrix (up to the permutation Γ) is block diagonal, W = diag(W_1, ..., W_n) Γ, with one block per subspace.
• SSC algorithm: solve the sparse self-expression program for C, build W from C, then apply spectral clustering to W.
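A minimal sketch of the remaining SSC steps, assuming the C returned by the previous sketch: build W = |C| + |C|^T and hand it to an off-the-shelf spectral clustering routine with a precomputed affinity.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def ssc_cluster(C, n_clusters):
    """Build the similarity graph W = |C| + |C|^T and run spectral clustering (sketch)."""
    W = np.abs(C) + np.abs(C).T            # symmetric, non-negative edge weights
    model = SpectralClustering(n_clusters=n_clusters,
                               affinity="precomputed",
                               assign_labels="kmeans",
                               random_state=0)
    return model.fit_predict(W)

# Example usage (assuming Y and ssc_self_expression from the previous sketch):
# C = ssc_self_expression(Y, alpha=0.01)
# labels = ssc_cluster(C, n_clusters=40)
```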
5. 5
Noise and Sparse Outlying Entries
• In real-world problems, data are often corrupted by noise and sparse outlying entries:
  y_i = y_i^0 + e_i^0 + z_i^0, with sparse outlying entries e_i^0 ∈ ℝ^D and noise z_i^0 ∈ ℝ^D.
• Using the self-expressiveness property of the noise-free points, y_i^0 = Σ_{j≠i} c_ij y_j^0, so
  y_i = Σ_{j≠i} c_ij y_j + e_i + z_i, where e_i ≡ e_i^0 − Σ_{j≠i} c_ij e_j^0 and z_i ≡ z_i^0 − Σ_{j≠i} c_ij z_j^0.
• In matrix form: Y = YC + E + Z, diag(C) = 0, where C is the sparse coefficient matrix, E the sparse outlying matrix, and Z the noise matrix.
• Convex program:
  min_{C,E,Z} ||C||_1 + λ_e ||E||_1 + (λ_z / 2) ||Z||_F^2   s.t.  Y = YC + E + Z, diag(C) = 0,
  with λ_e = α_e / μ_e, λ_z = α_z / μ_z; α_e, α_z > 1; μ_e ≡ min_i max_{j≠i} ||y_j||_1, μ_z ≡ min_i max_{j≠i} |y_i^T y_j|.
• Use C to build a similarity graph and infer the clustering of the data using spectral clustering.
6. 6
Affine Subspaces
• A d_ℓ-dimensional affine subspace S_ℓ can be considered as a subset of a (d_ℓ + 1)-dimensional linear subspace that includes S_ℓ and the origin.
• Affine self-expressiveness: y_i = Y c_i, 1^T c_i = 1, c_ii = 0.
• Optimization program:
  min_{C,E,Z} ||C||_1 + λ_e ||E||_1 + (λ_z / 2) ||Z||_F^2   s.t.  Y = YC + E + Z, 1^T C = 1^T, diag(C) = 0.
• E.g. the motion segmentation (x, y, z, time) problem involves clustering of data that lie in a union of 3-dimensional affine subspaces.
• E. Elhamifar and R. Vidal, “Sparse subspace clustering: Algorithm, theory, and applications,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 35, no. 11, pp. 2765–2781, Nov. 2013.
• R. Vidal, “Subspace clustering,” IEEE Signal Process. Mag., vol. 28, no. 2, pp. 52–68, Mar. 2011.
7. 7
Spectral Clustering - Similarity Graphs
• Similarity graph G = (V, E): each vertex v_i in the graph represents a data point x_i, and the edge between v_i and v_j is weighted by w_ij ≥ 0.
• We want to find a partition of the graph such that the edges between different groups have very low weights and the edges within a group have high weights.
• Let G = (V, E) be an undirected graph (w_ij = w_ji) with vertex set V = {v_1, ..., v_n}, W the weighted adjacency matrix, and D the degree matrix, a diagonal matrix with degrees d_i = Σ_{j=1}^n w_ij on the diagonal.
• For a subset of vertices A ⊂ V, define the indicator vector 1_A = (f_1, ..., f_n)^T ∈ ℝ^n with f_i = 1 if v_i ∈ A, and 0 otherwise.
• For two disjoint sets A, B ⊂ V, define W(A, B) = Σ_{i∈A, j∈B} w_ij.
• U. von Luxburg, “A tutorial on spectral clustering,” Statistics and Computing, vol. 17, no. 4, pp. 395–416, 2007.
8. 8
Spectral Clustering - Similarity Graphs
• ε-neighborhood graph:
  Connect points whose pairwise distance s_ij = ||x_i − x_j||^2 is at most ε; since the retained distances are all of roughly the same scale, the edges are usually treated as unweighted:
  w_ij = ε if s_ij ≤ ε, and w_ij = 0 if s_ij > ε.
• k-NN (k-Nearest Neighbor) graph:
  Connect v_i and v_j if v_j is a k-NN of v_i, weighted by the similarity w_ij = s_ij. Directed or undirected.
• Mutual k-NN graph:
  Same as k-NN, but only include mutual k-NNs.
• Fully connected graph:
  Longer distance ⇒ lower similarity, e.g. the Gaussian similarity function s(x_i, x_j) = exp(−||x_i − x_j||^2 / (2σ^2)).
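A small sketch of one possible similarity-graph construction (Gaussian weights restricted to a symmetrized k-NN mask); the helper name and parameter choices are illustrative, not the only valid construction.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph
from sklearn.metrics.pairwise import rbf_kernel

def knn_similarity_graph(X, k=10, sigma=1.0):
    """Symmetric k-NN similarity graph with Gaussian weights (sketch).

    X : N x D array, one data point per row (note: rows, not columns, here).
    """
    # Gaussian (RBF) similarities s_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)).
    S = rbf_kernel(X, gamma=1.0 / (2.0 * sigma ** 2))
    # 0/1 k-NN connectivity (directed), then symmetrized to an undirected graph.
    mask = kneighbors_graph(X, n_neighbors=k, mode="connectivity").toarray()
    mask = np.maximum(mask, mask.T)
    return S * mask                        # keep similarities only on k-NN edges
```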
9. 9
Unnormalized Graph Laplacians and Their Basic Properties
• Unnormalized Laplacian matrix: L = D − W, where G is an undirected, weighted graph with weight matrix W, w_ij = w_ji ≥ 0.
• Basic properties:
  1. For every f ∈ ℝ^n we have f^T L f = (1/2) Σ_{i,j=1}^n w_ij (f_i − f_j)^2.
  2. L is symmetric and positive semi-definite.
  3. The smallest eigenvalue of L is 0, and the corresponding eigenvector is the constant one vector 1.
  4. L has n non-negative, real-valued eigenvalues 0 = λ_1 ≤ λ_2 ≤ ... ≤ λ_n.
• Invariance to self-edges: L_ii = d_i − w_ii, L_ij = −w_ij, so self-loops do not change L.
• The multiplicity k of the eigenvalue 0 of L equals the number of connected components A_1, ..., A_k in the graph. The eigenspace of eigenvalue 0 is spanned by the indicator vectors 1_{A_1}, ..., 1_{A_k} of those components.
• After ordering the vertices by component, L = diag(L_1, ..., L_k), a block diagonal form.
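A toy numerical check of properties 2–4 and of the connected-components property, assuming a small hand-built weight matrix with two components.

```python
import numpy as np

def unnormalized_laplacian(W):
    """L = D - W for a symmetric non-negative weight matrix W (sketch)."""
    D = np.diag(W.sum(axis=1))
    return D - W

# Two disjoint 2-cliques: the eigenvalue 0 of L should have multiplicity 2.
W = np.array([[0., 1., 0., 0.],
              [1., 0., 0., 0.],
              [0., 0., 0., 1.],
              [0., 0., 1., 0.]])
L = unnormalized_laplacian(W)
eigvals = np.linalg.eigvalsh(L)            # real, non-negative, sorted ascending
print(np.sum(np.isclose(eigvals, 0.0)))    # -> 2 connected components
```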
10. 10
Normalized Graph Laplacians and Their Basic Properties
• Normalized symmetric Laplacian matrix: L_sym ≡ D^{−1/2} L D^{−1/2} = I − D^{−1/2} W D^{−1/2}.
• Normalized random walk Laplacian matrix: L_rw ≡ D^{−1} L = I − D^{−1} W.
14. 14
Graph Cut Point of View
We want to find a partition of the graph such that the edges between different groups
have very low weights and the edges within a group have high weights.
• The mincut approach simply consists in choosing a partition A_1, ..., A_k which minimizes
  cut(A_1, ..., A_k) ≡ (1/2) Σ_{i=1}^k W(A_i, Ā_i),
  where Ā denotes the complement of A.
17. 17
RatioCut
• Two measures of the size of a subset A ⊂ V: |A| (the number of vertices) and vol(A) = Σ_{i∈A} d_i; RatioCut uses |A|:
  RatioCut(A_1, ..., A_k) ≡ (1/2) Σ_{j=1}^k W(A_j, Ā_j) / |A_j|.
• Given a partition of V into k sets A_1, ..., A_k, define indicator vectors h_j = (h_{1j}, ..., h_{nj})^T with h_{ij} = 1/√|A_j| if v_i ∈ A_j, and 0 otherwise, and set the matrix H ∈ ℝ^{n×k} whose columns are the k indicator vectors.
• Relaxed problem: min_{H ∈ ℝ^{n×k}} tr(H^T L H)  s.t.  H^T H = I.
  Corresponding to the unnormalized Laplacian matrix L = D − W.
18. 18
NCut
• Two measures of the size of a subset A ⊂ V: |A| and vol(A) = Σ_{i∈A} d_i; NCut uses vol(A):
  NCut(A_1, ..., A_k) ≡ (1/2) Σ_{j=1}^k W(A_j, Ā_j) / vol(A_j).
• Given a partition of V into k sets A_1, ..., A_k, define indicator vectors h_j = (h_{1j}, ..., h_{nj})^T with h_{ij} = 1/√vol(A_j) if v_i ∈ A_j, and 0 otherwise, and set the matrix H ∈ ℝ^{n×k} whose columns are the k indicator vectors.
• Relaxed problem: min_{H ∈ ℝ^{n×k}} tr(H^T L H)  s.t.  H^T D H = I.
  Corresponding to the normalized Laplacians:
  L_sym ≡ D^{−1/2} L D^{−1/2} = I − D^{−1/2} W D^{−1/2},
  L_rw ≡ D^{−1} L = I − D^{−1} W.
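A sketch of the relaxed NCut: with the substitution F = D^{1/2} H the constraint H^T D H = I becomes F^T F = I, so the k eigenvectors of L_sym with smallest eigenvalues followed by k-means give the clusters. The row normalization and the small numerical floors are implementation choices, not part of the slide.

```python
import numpy as np
from sklearn.cluster import KMeans

def ncut_spectral_clustering(W, k):
    """Relaxed NCut (sketch): k smallest eigenvectors of L_sym, then k-means."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    L_sym = np.eye(len(d)) - (d_inv_sqrt[:, None] * W) * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)        # eigenvalues in ascending order
    F = vecs[:, :k]                        # embedding = k smallest eigenvectors
    # Row-normalize before k-means (Ng-Jordan-Weiss style embedding).
    F = F / np.maximum(np.linalg.norm(F, axis=1, keepdims=True), 1e-12)
    return KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(F)
```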
19. 19
Low-Rank Representation (LRR)
• SSC: min_{C,E,Z} ||C||_1 + λ_e ||E||_1 + (λ_z / 2) ||Z||_F^2  s.t.  Y = YC + E + Z, 1^T C = 1^T, diag(C) = 0.
• LRR (noise-free): min_C ||C||_*  s.t.  Y = YC, 1^T C = 1^T.
• LRR (noise or outliers): min_{C,E} ||C||_* + λ ||E||_{2,1}  s.t.  Y = YC + E, 1^T C = 1^T,
  where the ℓ_{2,1}-norm is ||E||_{2,1} = Σ_{k=1}^N sqrt(Σ_{j=1}^N E_jk^2).
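For the standard noise-free LRR without the affine constraint, the minimizer is known in closed form: C* = V V^T, the Shape Interaction Matrix built from the skinny SVD of Y. A sketch under that assumption (the slide's affine variant with 1^T C = 1^T is not handled here):

```python
import numpy as np

def lrr_noise_free(Y, tol=1e-10):
    """Closed-form solution of noise-free LRR, min ||C||_* s.t. Y = YC (sketch)."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    r = int(np.sum(s > tol))               # numerical rank of Y
    V = Vt[:r, :].T                        # N x r right singular vectors
    return V @ V.T                         # N x N lowest-rank representation
```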
20. 20
Drawback of SSC/LRR
SSC:
• Disadvantage of SSC is that it is provably correct only in the case of independent or
disjoint subspaces.
LRR:
• Disadvantage of LRR is that it is provably correct only in the case of noiseless data drawn
from independent subspaces.
• Another drawback is that the optimization problem involves O(N²) variables.
21. 21
Subspace Clustering Model
• Given a data set X = [x_1, x_2, ..., x_N] ∈ ℝ^{D×N} whose columns belong to ∪_{i=1}^k S_i, where k is the number of clusters.
• Each data sample x_i is represented as a linear combination of the other data samples (self-expressiveness property):
  x_i = Σ_{j≠i} z_ij x_j, and if x_i ∈ S_k then z_ij = 0 for every x_j ∉ S_k,
  equivalent to X = XZ, where Z ∈ ℝ^{N×N} is called the self-representation (coefficient) matrix.
• Affinity matrix defined as W = (|Z| + |Z^T|) / 2, Laplacian matrix L = D − W.
23. 23
Sparse Models
• Sparse Representation (compressive sensing):
  min ||z||_0  s.t.  y = Az.
• Sparse Subspace Clustering (per sample):
  min ||z_i||_0  s.t.  x_i = X̂_i z_i, where X̂_i = [x_1, ..., x_{i−1}, x_{i+1}, ..., x_N].
• Matrix form: min ||Z||_0  s.t.  X = XZ, diag(Z) = 0.
• Convex relaxation: min ||Z||_1  s.t.  X = XZ, diag(Z) = 0.
E. Elhamifar and R. Vidal, "Sparse subspace clustering," in Proc. 2009 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Miami, FL, USA, 2009, pp. 2790–2797.
SIM (Shape Interaction Matrix)
24. 24
Low-Rank Models
• Matrix Completion (MC) — filling in missing entries:
  min rank(A)  s.t.  π_Ω(A) = π_Ω(D),
  where π_Ω(·) is the projection operator: [π_Ω(D)]_ij = D_ij if (i, j) ∈ Ω, and 0 otherwise.
• Robust PCA — denoising:
  min_{A,E} rank(A) + λ ||E||_0  s.t.  D = A + E.
• Low-Rank Representation (LRR) based clustering:
  min_{Z,E} rank(Z) + λ ||E||_{2,0}  s.t.  X = XZ + E.
• All three problems are NP-hard!
25. 25
Low-Rank Models - Convex Formulation
• Matrix Completion (MC) — filling in missing entries:
  min ||A||_*  s.t.  π_Ω(A) = π_Ω(D),
  where π_Ω(·) is the projection operator: [π_Ω(D)]_ij = D_ij if (i, j) ∈ Ω, and 0 otherwise.
• Robust PCA — denoising:
  min_{A,E} ||A||_* + λ ||E||_1  s.t.  D = A + E.
• Low-Rank Representation (LRR) based clustering:
  min_{Z,E} ||Z||_* + λ ||E||_{2,1}  s.t.  X = XZ + E.
• Here ||A||_* = Σ_i σ_i(A) is the nuclear norm, and ||E||_{2,1} = Σ_j ||E_{:,j}||_2 = Σ_j sqrt(Σ_i E_ij^2).
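A sketch of a Singular Value Thresholding (SVT) style iteration for the convex matrix-completion problem above; tau, the step size, and the stopping rule are illustrative tuning choices, not values from the slides.

```python
import numpy as np

def svt_matrix_completion(D, mask, tau=5.0, step=1.0, n_iter=200):
    """SVT-style sketch for nuclear-norm matrix completion.

    D    : observed matrix (entries outside the mask are ignored)
    mask : boolean array, True where D_ij is observed (the set Omega)
    tau  : threshold on singular values (controls the nuclear-norm weight)
    """
    Y = np.zeros_like(D, dtype=float)
    for _ in range(n_iter):
        # Shrinkage: soft-threshold the singular values of the dual variable Y.
        U, s, Vt = np.linalg.svd(Y, full_matrices=False)
        A = U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt
        # Gradient-style step enforcing agreement on the observed entries only.
        resid = np.where(mask, D - A, 0.0)
        Y = Y + step * resid
        if np.linalg.norm(resid) <= 1e-6 * np.linalg.norm(np.where(mask, D, 0.0)):
            break
    return A
```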
26. 26
Traditional Framework of Sparse Subspace Clustering
• Pipeline: input data X → subspace representation (coefficient matrix Z) → affinity matrix W → spectral clustering (e.g. NCut) → clustering results.
• General form of the subspace clustering representation:
  min_{Z,E} R(Z) + λ L(X, XZ)  s.t.  X = XZ + E,
  where X = [x_1, x_2, ..., x_N] ∈ ℝ^{D×N}, Z = [z_1, z_2, ..., z_N] ∈ ℝ^{N×N}, E is the error matrix,
  L(·,·) denotes the loss function, R(·) is the regularizer, and λ is a hyperparameter that controls the intensity of the loss penalty.
• Examples:
  SSC: min ||Z||_1  s.t.  X = XZ, diag(Z) = 0.
  LRR: min ||Z||_*  s.t.  X = XZ, diag(Z) = 0.
  LRR (with noise): min_{Z,E} ||Z||_* + λ ||E||_{2,1}  s.t.  X = XZ + E.
• Spectral clustering:
  RatioCut: argmin_H tr(H^T L H)  s.t.  H^T H = I,
  NCut:     argmin_F tr(F^T D^{−1/2} L D^{−1/2} F)  s.t.  F^T F = I.
27. 27
Database Introduction and How to Deal With
• ORL Database: https://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html
• The orl_faces folder includes 40 subfolders (= 40 different people); each subfolder has 10 pictures of the same person, stored in PGM format. The size of each image is 92 x 112 pixels, with 256 grey levels per pixel.
• Each image is downsized to 32 x 32 pixels and normalized to [0, 1], then vectorized, giving the data set
  X = [x_1, x_2, ..., x_N] ∈ ℝ^{D×N} belonging to ∪_{i=1}^k S_i, where k is the number of clusters.
• Here X ∈ ℝ^{1024×400} (or a reduced X* after PCA, e.g. with 500 or 100 features per image), and we hope k = 40 in this example.
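A possible loading/preprocessing sketch for the ORL data described above; the folder layout (orl_faces/s1 ... s40 with .pgm files) and the function name are assumptions based on the standard distribution of the database.

```python
import glob
import numpy as np
from PIL import Image

def load_orl(root="orl_faces", size=(32, 32)):
    """Load the ORL faces into X (D x N) and subject labels (sketch)."""
    images, labels = [], []
    for subject, folder in enumerate(sorted(glob.glob(f"{root}/s*"))):
        for path in sorted(glob.glob(f"{folder}/*.pgm")):
            img = Image.open(path).resize(size)              # downsize to 32 x 32
            images.append(np.asarray(img, dtype=np.float64).ravel() / 255.0)
            labels.append(subject)
    X = np.stack(images, axis=1)                              # D x N = 1024 x 400
    return X, np.array(labels)
```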
28. 28
Evaluation Metrics
• Normalized Mutual Information (NMI)
• Accuracy (ACC)
• Adjusted Rand Index (ARI)
• F-score
• Precision
• Recall
• Representation Visualization
• NMI between two label assignments X = [x_1, x_2, ..., x_r] and Y = [y_1, y_2, ..., y_s]:
  MI(X, Y) = Σ_i Σ_j P(i, j) log( P(i, j) / (P(i) P'(j)) ),
  NMI(X, Y) = MI(X, Y) / sqrt( H(X) H(Y) ),
  where the entropy is H(X) = − Σ_k P(x_k) log P(x_k).
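A sketch of how these metrics can be computed; NMI and ARI come from scikit-learn, while ACC is implemented via the usual Hungarian matching of cluster labels.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.metrics import normalized_mutual_info_score, adjusted_rand_score

def clustering_accuracy(y_true, y_pred):
    """ACC: best one-to-one match between predicted and true labels (Hungarian)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    labels_true, labels_pred = np.unique(y_true), np.unique(y_pred)
    cost = np.zeros((labels_pred.size, labels_true.size))
    for i, p in enumerate(labels_pred):
        for j, t in enumerate(labels_true):
            cost[i, j] = -np.sum((y_pred == p) & (y_true == t))
    row, col = linear_sum_assignment(cost)   # maximizes total agreement
    return -cost[row, col].sum() / y_true.size

# nmi = normalized_mutual_info_score(y_true, y_pred)
# ari = adjusted_rand_score(y_true, y_pred)
# acc = clustering_accuracy(y_true, y_pred)
```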
29. 29
SC Novel Research Approaches
(1) Design the R(Z) regularization term so that the coefficient matrix Z has a good clustering structure, without losing realistic data assumptions.
(2) Design an L(·,·) loss term that is robust to any kind of noise (not just Gaussian noise). Bayesian estimation may be a good way to approach this.
(3) Fast and high-accuracy algorithm design.
(4) Application exploration.
• General form of the subspace clustering representation:
  min_{Z,E} R(Z) + λ L(X, XZ)  s.t.  X = XZ + E,
  where X = [x_1, x_2, ..., x_N] ∈ ℝ^{D×N}, Z = [z_1, z_2, ..., z_N] ∈ ℝ^{N×N}, E is the error matrix,
  L(·,·) denotes the loss function, R(·) is the regularizer, and λ is a hyperparameter that controls the intensity of the loss penalty.
30. 30
Drawback of SSC/LRR and Model We Proposed
SSC:
• Disadvantage of SSC is that it is provably correct only in the case of independent or
disjoint subspaces.
LRR:
• Disadvantage of LRR is that it is provably correct only in the case of noiseless data drawn
from independent subspaces.
• Another drawback is that the optimization problem involves O(N²) variables.
Model we proposed:
• SSC + LRR: min ||C||_* + ||C||_1 + λ ||E||_{2,1}  s.t.  Y = YC + E + Z, 1^T C = 1^T, diag(C) = 0
  (C: coefficient matrix, E: sparse outlying matrix, Z: noise matrix),
  where the ℓ_{2,1}-norm is ||E||_{2,1} = Σ_{k=1}^N sqrt(Σ_{j=1}^N E_jk^2).
31. 31
Why use L2,1-Norm ?
• Illustrating three typical types of errors and the norms used to model them:
  (a) For small Gaussian noise: ||E||_F^2.
  (b) For random corruptions: ||E||_1.
  (c) For sample-specific corruptions and outliers: ||E||_{2,1}.
• SSC + LRR: min ||C||_* + ||C||_1 + λ ||E||_{2,1}  s.t.  Y = YC + E + Z, 1^T C = 1^T, diag(C) = 0,
  where the ℓ_{2,1}-norm is ||E||_{2,1} = Σ_{k=1}^N sqrt(Σ_{j=1}^N E_jk^2).
G. Liu, Z. Lin, S. Yan, J. Sun, Y. Yu, and Y. Ma, “Robust recovery of subspace structures by low-rank representation,” IEEE Trans. Pattern Anal. Mach.
Intell., vol. 35, no. 1, pp. 171–184, Jan. 2013.
32. 32
Research Interests – YW Peter Hong
• Signal Processing & Communication: mobile communication, UAV communications (connected UAVs, beamforming, broadcasting, physical connection to users).
• Machine Learning & Artificial Intelligence: deep learning based communications, crowdsourcing & IoT, distributed learning, recommender systems.
37. 37
Subspace Clustering (SC)
• Subspace Clustering: Data points from the same cluster should belong to the same subspace (Self Representation Property).
• [Figure: representation coefficient matrices over the n data samples, cluster 1, ..., cluster K — different views, but similar representation coefficients.]
40. 40
Feature Extraction for Incomplete Data Via Low-Rank Tensor Decomposition With Feature Regularization
Q. Shi, Y. M. Cheung, Q. Zhao, H. Lu, "Feature Extraction for Incomplete Data Via Low-Rank Tensor Decomposition With Feature Regularization,"
IEEE Transactions on Neural Networks and Learning Systems, 2018
41. 41
Extracting Features From Incomplete Tensors
• Old fashioned:
• Tensor completion:
• Predicting missing data and recovery.
• Without considering the relationship among data samples for feature extraction.
• Tensor completion methods + Feature extraction methods:
• Amplifies the approximation error as the missing data and the features are learned in
separate stages.
• Low efficiency.
• To solve these problems, low-rank Tensor Decomposition with feature Variance Maximization (TDVM) was introduced as a unified framework, with two variants: TDVM-Tucker and TDVM-CP.
• It simultaneously estimates missing data via low-rank approximation and explores the relationship among samples via feature regularization.
• Tucker/CP decomposition is used for the low-rank approximation, with feature variance maximization as the feature constraint.
• TDVM-Tucker: minimizes the tensor nuclear norm of the core tensors while maximizing the variance of the core tensors (core tensors = extracted features).
• TDVM-CP: minimizes the tensor nuclear norm of the weight vectors while maximizing the variance of the learned feature vectors for feature regularization (weight vector = feature vector).
42. 42
Tucker Decomposition
• A tensor X ∈ ℝ^{I×J×K} is decomposed into a core tensor G ∈ ℝ^{P×Q×R} and factor matrices A ∈ ℝ^{I×P}, B ∈ ℝ^{J×Q}, C ∈ ℝ^{K×R} (typically P ≤ I, Q ≤ J, R ≤ K):
  X ≈ G ×_1 A ×_2 B ×_3 C, or in unfolded form X_(1) ≈ A G_(1) (C ⊗ B)^T, where ⊗ is the Kronecker product.
• The factor matrices (which are usually orthogonal) A, B, and C are often referred to as the principal components in the respective tensor modes.
• This results in a compression of X, with G being the compressed version of X.
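A minimal truncated-HOSVD sketch of a Tucker model in plain NumPy (a simple non-iterative alternative to ALS-based Tucker fitting); the helper names unfold and hosvd_tucker and the rank handling are illustrative.

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding: move `mode` to the front and flatten the remaining modes."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def hosvd_tucker(X, ranks):
    """Truncated HOSVD (sketch): X ≈ G x_1 U_1 x_2 U_2 ... for the given multilinear ranks."""
    factors = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(X, mode), full_matrices=False)
        factors.append(U[:, :r])           # leading left singular vectors of each unfolding
    # Core tensor: project X onto the factor bases, G = X x_1 U_1^T x_2 U_2^T ...
    G = X
    for mode, U in enumerate(factors):
        G = np.moveaxis(np.tensordot(U.T, G, axes=(1, mode)), 0, mode)
    return G, factors
```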
43. 43
Canonical Polyadic (CP) Decomposition
• CP decomposition writes a tensor as a sum of R rank-one terms with weights λ_1, ..., λ_R.
• Factor matrices: the columns of A, B, and C are normalized to length one, with the scales absorbed into the weights λ_1, ..., λ_R.
• For a general Nth-order tensor: X ≈ Σ_{r=1}^R λ_r a_r^(1) ∘ a_r^(2) ∘ ... ∘ a_r^(N), where ∘ is the outer product.
44. 44
TDVM-Tucker: Learning Low-dimensional Tensor Features
• TDVM-Tucker combines a low-rank Tucker approximation, which aims to minimize the reconstruction error and obtain low-dimensional features, with a term that maximizes the variance of the learned features.
46. 46
Low-Rank Regularized Heterogeneous Tensor Decomposition (LRRHTD) for Subspace Clustering
J. Zhang, X. Li, P. Jing, J. Liu, Y. Su, "Low-rank regularized heterogeneous tensor decomposition for subspace clustering", IEEE Signal Process. Lett.,
vol. 25, no. 3, pp. 333-337, Mar. 2018.
47. 47
Tucker Decomposition
• A tensor X ∈ ℝ^{I×J×K} is decomposed into a core tensor G ∈ ℝ^{P×Q×R} and factor matrices A ∈ ℝ^{I×P}, B ∈ ℝ^{J×Q}, C ∈ ℝ^{K×R} (typically P ≤ I, Q ≤ J, R ≤ K):
  X ≈ G ×_1 A ×_2 B ×_3 C, or in unfolded form X_(1) ≈ A G_(1) (C ⊗ B)^T, where ⊗ is the Kronecker product.
• The factor matrices (which are usually orthogonal) A, B, and C are often referred to as the principal components in the respective tensor modes.
• This results in a compression of X, with G being the compressed version of X.
48. 48
Tucker Decomposition
• Tucker model can be generalized to N-way tensors
• The concept of n-rank (denoted by rank_n(X)): it corresponds to the column rank of the n-th unfolding X_(n) of the tensor X.
• According to the type of constraints Tucker decomposition approaches can be roughly grouped
into three categories:
• orthogonal tensor decomposition
• non-negative tensor decomposition
• sparse tensor decomposition
• Almost all of the above algorithms decompose tensors based on the isotropy hypothesis (i.e.
orthogonal, non-negative...), meaning that the factor matrices are learned in an equivalent way
for all modes.
• Not suitable for heterogeneous tensor data.
rank_n(X) = rank(X_(n))
49. • For all but the last mode, LRRHTD seeks a set of orthogonal projection matrices to map the
original tensor data into a low-dimensional common subspace.
• But for the last mode, a low-rank projection matrix is learned by imposing a nuclear-norm so that
a lowest rank representation that reveals the global structure of samples is obtained for
performing clustering.
• M-th order tensors, N is the total number of samples:
• We concatenate the N tensors to yield a (M + 1)-th order tensor
• The goal of LRRHTD is to find M orthogonal factor matrices for intrinsic
low-dimensional representation and the lowest rank representation using the mapped
low-dimensional tensor as a dictionary, and D < N.
49
Low-Rank Regularized Heterogeneous Tensor
Decomposition (LRRHTD)
50. 50
Low-Rank Regularized Heterogeneous Tensor Decomposition (LRRHTD)
• Tucker decomposition of the concatenated tensor X can be estimated in a general form as X = G ×_1 U^(1) ×_2 ... ×_{M+1} U^(M+1) + E, where G is the core tensor and E is the approximation error tensor.
• Cost function: minimize (arg min) the approximation error over the factor matrices, with orthogonality constraints on the first M factor matrices and a nuclear-norm penalty on the last-mode (sample-mode) representation.
52. 52
Multiview Subspace Clustering via Tensorial t-Product Representation
M. Yin, J. Gao, S. Xie and Y. Guo, "Multiview Subspace Clustering via Tensorial t-Product Representation,“ IEEE Transactions on Neural Networks
and Learning Systems, pp.1-14, 2018.
53. 53
Main Idea
• Multiview subspace clustering is based on the fact that multiview data are generated from a latent subspace.
• An object can be characterized by a color view and/or a shape view.
• An image can be depicted by a color histogram or Fourier shape information.
• Multiview clustering utilizes the complementary information of objects from different feature spaces.
• Challenge: data from different views show large divergence or heterogeneity.
• Most previous methods focus on capturing pairwise correlations between different views, rather than the higher-order correlation underlying the multiview data.
• The t-product, one type of tensor–tensor product, was introduced to provide a matrix-like multiplication for third-order tensors.
• It gives a better way (a convolution operation) of exploiting the intrinsic structure of higher-order tensors.
54. 54
Conventional Methods Unfold the Data to Vectors
• ORL Database: https://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html
• The orl_faces folder includes 40 subfolders (= 40 different people); each subfolder has 10 pictures of the same person, stored in PGM format. The size of each image is 92 x 112 pixels, with 256 grey levels per pixel.
• Each image is downsized to 32 x 32 pixels and normalized to [0, 1], then vectorized, giving the data set
  X = [x_1, x_2, ..., x_N] ∈ ℝ^{D×N} belonging to ∪_{i=1}^k S_i, where k is the number of clusters.
• Here X ∈ ℝ^{1024×400} (or a reduced X* after PCA, e.g. with 500 or 100 features per image), and we hope k = 40 in this example.
55. 55
• Conventional methods usually unfold the data to vectors.
• But blindly vectorizing may cause the problem of “curse of dimensionality” and also
damage the second-order structure within data, such as spatial information.
Drawback of Conventional Methods
56. 56
t-Linear Combination for Third-Order Tensor
• Traditional tensor methods (CP, Tucker and HOSVD) are not directly applicable to the third-
order tensors.
• Kilmer et al. presented t-product to define a matrixlike multiplication for third-order tensors.
Called t-Product.
M. E. Kilmer, K. Braman, N. Hao, and R. C. Hoover, “Third-order tensors as operators on matrices: A theoretical and computational framework with
applications in imaging,” SIAM J. Matrix Anal. Appl., vol. 34, no. 1, pp. 148–172, 2013.
• An m × n matrix can be viewed as an m × n × 1 tensor.
• t-linear combination: a third-order tensor X is a t-linear combination of basis tensors A with coefficient tensors B, X = A ∗ B, where ∗ is the circular convolution operation.
57. 57
t-Linear Combination
• A tensor signal is represented by a t-linear combination of K tensor dictionary atoms.
59. 59
Sparse and Low-Rank Representation Using the t-Linear Combination
• We propose to seek the most sparse and lowest-rank representation of multiview data by employing the self-expressive property.
60. 60
• We propose to seek the most sparse and lowest rank representation of multiview data by
employing the self-expressive property. (continued…)
• The error term fits the representation errors in the third-order tensor space by using the t-linear combination.
• Different from the previous matricization along certain dimensions, the block circulant
matricization will preserve more spatial correlation within data.
• t-linear combination is a generalization of the standard linear combination.
• A constraint imposed across the multiview data encourages a consensus representation by forcing all the lowest-rank coefficients to be close in all views.
63. 63
Tensor LRR and Sparse Coding-Based Subspace Clustering
Y. Fu, J. Gao, D. Tien, Z. Lin, and X. Hong, “Tensor LRR and sparse coding-based subspace clustering,” IEEE Trans. Neural Netw. Learn. Syst.,
vol. 27, no. 9, pp. 2120–2133, Sep. 2016.
64. 64
Main Idea
Problems
• Traditional subspace clustering algorithms may be compromised in practical applications:
1. Don’t consider the inherent structure and correlations in the original data.
2. Using the original high-dimensional features is not effective for filtering the noisy/redundant information in the original feature spaces, and the time complexity grows exponentially with the dimensionality.
Solutions
1. Finds a lowest rank representation for the input tensor, which can be further used to build an
affinity matrix and then do the spectral clustering.
2. Dictionary learning: finding low-dimensional inherent feature spaces.
• Sparse representation: define holistic sparsity on a whole data representation matrix.
• Sparse coding (SC): finds the sparsest representation of each data vector individually.
• Drawback: SC deteriorates when data are corrupted.
Proposed model
• Input data are represented in their original structural form, a tensor.
• Finds the lowest rank representation for each spatial mode of the input tensor.
• Sparse representation with respect to a learned dictionary in the feature mode.
• Spatial spaces + feature spaces to build an affinity matrix for spectral clustering.
65. 65
Dictionary Learning
• Dictionary learning for sparse representation aims at learning a dictionary D, such that each
sample in the data set can be represented as a sparse linear combination of the atoms of D.
• Optimization (e.g. K-SVD):
1. Update zi (SC) by fixing the dictionary D.
2. Update dictionary with a fixed sparse representation.
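A sketch of this alternating scheme using scikit-learn's DictionaryLearning (a related alternating dictionary-learning routine, not K-SVD itself); the sizes, the OMP sparse-coding step, and the stand-in data are illustrative.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

# Stand-in data for illustration: samples as rows, each row a feature vector.
X_feat = np.random.randn(200, 64)

dl = DictionaryLearning(n_components=32,              # number of atoms in D
                        transform_algorithm="omp",
                        transform_n_nonzero_coefs=5,  # sparsity per sample
                        max_iter=50,
                        random_state=0)
codes = dl.fit_transform(X_feat)    # sparse coefficients (200 x 32), updated with D fixed
D = dl.components_                  # learned dictionary (32 x 64), updated with codes fixed
# Reconstruction: X_feat ≈ codes @ D
```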
66. 66
Tensor Low-Rank Representation on Spatial Modes
• Tucker Decomposition of the data tensor X into a core tensor and factor matrices (the Kronecker product appears in the unfolded form).
• Proposed data representation model: obtained from the Tucker form if we let the sample-mode factor play the role of the representation matrix Z.
• The entries of Z are interpreted as the similarities between the pairs of all the vectors along the N-mode of the data tensor X.
• Subspace clustering self-expressiveness: aim to find a linear representation, Z, for all the samples in terms of all other samples.
• The regularizer on Z uses b = 1 in SSC and b = * in LRR, applied on the spatial modes.
67. 67
Dictionary Learning for Sparse Representation
on Feature Mode
• We consider a dictionary learning model for sparse representation along the N-mode
(feature mode) of X.
• Feature vectors of each sample can be represented by a few atoms of D.
dictionary to be learned.
mode-N matricization of tensor X(N)
sparse representation on feature spaces.
• Soft reminder TLRR in spatial modes:
feature mode
is the maximum # nonzero elements in
the matrix A.
68. 68
Tensor Spatial Low-Rank Representation and Feature Sparse Coding
• Try to combine the spatial modes + the feature mode.
• Define an inverse matricization transformation that converts a matrix back into an order-N tensor.
• Finally proposed learning model (Tensor LRR model): lowest-rank representations on the spatial modes together with a dictionary D ∈ ℝ^{I_N × m} to be learned and a sparse representation coefficient matrix A ∈ ℝ^{m × (I_1 I_2 ... I_{N−1})} on the feature mode, where R is a given sparsity.
69. 69
Tensor LRR Model — Proposed Model
• The finally proposed learning model aims, at the same time, to:
  • Find the lowest-rank representations along all the spatial modes (b = * as in LRR; b = 1 gives the SSC-style variant).
  • Learn a dictionary with its sparse representation over the samples on the feature mode.
72. 72
Proposed Methodology
• Observation model: Y = X + ε, where Y is the incomplete tensor data corrupted by noise and outliers, X is the clean tensor data, and ε is the noise tensor.
• Given sample tensors X_i ∈ ℝ^{I_1×I_2×...×I_M}, i = 1, ..., N (N is the number of data samples), construct an (M+1)-order tensor X ∈ ℝ^{I_1×I_2×...×I_M×N}.
• Objective function:
  min_Z (1/2) ||Y − G ×_1 A_1 ×_2 A_2 ... ×_M A_M ×_{M+1} Z||_F^2 + λ ||Z||_b
  s.t.  A_1^T A_1 = A_2^T A_2 = ... = A_M^T A_M = I.
• If b = 1: SSC (Sparse Subspace Clustering); if b = *: LRR (Low Rank Representation).
• Analogous to the SSC / LRR self-expression formulations above.
73. 73
Bayesian Tensor Factorization Probabilistic Model
• Same model as the previous slide: Y = X + ε, with X ≈ G ×_1 A_1 ×_2 A_2 ... ×_M A_M ×_{M+1} Z and A_m^T A_m = I (m = 1, ..., M), now treated as a probabilistic model.
• Using MAP estimation: maximize the log-joint distribution over the latent factors and Z.
• Predictive distribution over the missing entries.
• Try to add a prior on Z based on SSC and LRR.
74. 74
Model Learning via Bayesian Inference
• Because integrating over all latent variables as well as hyperparameters is intractable, we use the variational Bayesian (VB) framework.
• Based on the mean-field approximation:
  we seek a distribution q(Θ) to approximate the true posterior distribution p(Θ | Y_Ω) by minimizing the KL divergence.
75. 75
PARAFAC-based Multilinear Subspace Clustering for Tensor Data
P. A. Traganitis and G. B. Giannakis, “PARAFAC-based multilinear subspace clustering for tensor data,” in Proc. IEEE Global Conf. Signal Inf. Process.,
Washington, DC, USA, 2016, pp. 1280–1284.
76. 76
Main Idea
• Subspace clustering with K subspaces aims to find the subspace parameters and the cluster assignments.
• Data model: x_n = U_k y_n + m_k + v_n, where
  U_k ∈ ℝ^{D×d_k}: columns form a basis of S_k,
  y_n ∈ ℝ^{d_k}: low-dimensional representation of x_n in S_k,
  m_k ∈ ℝ^D: the "centroid" or intercept of S_k,
  v_n ∈ ℝ^D: noise vector,
  π_n: cluster assignment vector for x_n, where [π_n]_k denotes the k-th entry of π_n
  (hard clustering vs. soft/probabilistic clustering).
• When data are high-dimensional, both LRR and SSC have high computational complexity.
77. 77
• This work aims to extend the union of subspaces model, used by the SC algorithms, to a union
of multilinear subspaces.
• Thus the given data can be reshaped into a tensor format.
• Data in the same cluster share the factor matrices B_k, A_k, and differ only in the factor matrix C_k.
  This implies that each cluster k can be represented by a tensor X_k.
Main Idea
81. 81
Five Pillars Of The Mamba Mentality:
1. Be Passionate.
2. Be Obsessive.
3. Be Relentless.
4. Be Resilient.
5. Be Fearless.
“Obsessiveness is having the attention to detail for the action you are performing at the time
you’re performing it.”
“Success is the ability to use your passion to help someone else discover their passion.”
https://www.youtube.com/watch?v=NLElzEJPceA