SpecNet2: Orthogonalization-free spectral embedding by neural networks
- URL: http://arxiv.org/abs/2206.06644v1
- Date: Tue, 14 Jun 2022 07:19:10 GMT
- Title: SpecNet2: Orthogonalization-free spectral embedding by neural networks
- Authors: Ziyu Chen, Yingzhou Li, Xiuyuan Cheng
- Abstract summary: This paper introduces a new neural network approach, named SpecNet2, to compute spectral embedding.
SpecNet2 also allows separating the sampling of rows and columns of the graph affinity matrix.
Numerical experiments demonstrate the improved performance and computational efficiency of SpecNet2 on simulated data and image datasets.
- Score: 11.688030627514532
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Spectral methods which represent data points by eigenvectors of kernel
matrices or graph Laplacian matrices have been a primary tool in unsupervised
data analysis. In many application scenarios, parametrizing the spectral
embedding by a neural network that can be trained over batches of data samples
gives a promising way to achieve automatic out-of-sample extension as well as
computational scalability. Such an approach was taken in the original paper of
SpectralNet (Shaham et al. 2018), which we call SpecNet1. The current paper
introduces a new neural network approach, named SpecNet2, to compute spectral
embedding; it optimizes an objective equivalent to the eigen-problem and
removes the orthogonalization layer of SpecNet1. SpecNet2 also allows
separating the sampling of rows and columns of the graph affinity matrix by
tracking the neighbors of each data point through the gradient formula.
Theoretically, we show that any local minimizer of the new
orthogonalization-free objective reveals the leading eigenvectors. Furthermore,
global convergence for this new orthogonalization-free objective using a
batch-based gradient descent method is proved. Numerical experiments
demonstrate the improved performance and computational efficiency of SpecNet2
on simulated data and image datasets.
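The paper's exact loss is not reproduced in this summary. As an illustration of the general idea, the minimal sketch below runs plain gradient descent on the classic orthogonalization-free objective $\|A - YY^\top\|_F^2$, whose minimizers span the leading eigenspace of a symmetric positive semi-definite matrix; all names and hyperparameters here are hypothetical, not the paper's.

```python
import numpy as np

# Minimal sketch (not the paper's exact loss): gradient descent on the
# classic orthogonalization-free objective f(Y) = ||A - Y Y^T||_F^2.
# Note that no QR/Cholesky orthogonalization step is ever applied to Y.

def leading_eigenspace_gd(A, k, lr=0.05, n_iters=3000, seed=0):
    rng = np.random.default_rng(seed)
    Y = 0.1 * rng.standard_normal((A.shape[0], k))
    for _ in range(n_iters):
        # gradient of ||A - Y Y^T||_F^2 with respect to Y
        Y -= lr * 4.0 * (Y @ Y.T - A) @ Y
    return Y

# Toy symmetric PSD "affinity" matrix, rescaled for step-size stability
rng = np.random.default_rng(1)
B = rng.standard_normal((6, 6))
A = B @ B.T
A /= np.linalg.norm(A, 2)

Y = leading_eigenspace_gd(A, k=2)

# Compare the span of Y with the top-2 eigenvector subspace of A
w, V = np.linalg.eigh(A)                # ascending eigenvalues
U_top = V[:, -2:]
Q, _ = np.linalg.qr(Y)
overlap = np.linalg.norm(U_top.T @ Q)   # ~ sqrt(2) when subspaces match
```

The point of such objectives is that every local minimizer already spans the leading eigenspace, so batch gradient descent needs no per-step orthogonalization, which is what makes the approach compatible with mini-batch neural network training.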
Related papers
- A Library of Mirrors: Deep Neural Nets in Low Dimensions are Convex Lasso Models with Reflection Features [54.83898311047626]
We consider neural networks with piecewise linear activations, with depth ranging from two layers to an arbitrary but finite number of layers.
We first show that two-layer networks with piecewise linear activations are Lasso models using a discrete dictionary of ramp depths.
arXiv Detail & Related papers (2024-03-02T00:33:45Z)
- Neural Tangent Kernels Motivate Graph Neural Networks with Cross-Covariance Graphs [94.44374472696272]
We investigate NTKs and alignment in the context of graph neural networks (GNNs).
Our results establish the theoretical guarantees on the optimality of the alignment for a two-layer GNN.
These guarantees are characterized by the graph shift operator being a function of the cross-covariance between the input and the output data.
arXiv Detail & Related papers (2023-10-16T19:54:21Z)
- A theory of data variability in Neural Network Bayesian inference [0.70224924046445]
We provide a field-theoretic formalism which covers the generalization properties of infinitely wide networks.
We derive the generalization properties from the statistical properties of the input.
We show that data variability leads to a non-Gaussian action reminiscent of a $(\varphi^3+\varphi^4)$-theory.
arXiv Detail & Related papers (2023-07-31T14:11:32Z)
- Provable Data Subset Selection For Efficient Neural Network Training [73.34254513162898]
We introduce the first algorithm to construct coresets for RBFNNs, i.e., small weighted subsets that approximate the loss of the input data on any radial basis function network.
We then perform empirical evaluations on function approximation and dataset subset selection on popular network architectures and data sets.
arXiv Detail & Related papers (2023-03-09T10:08:34Z)
- Unfolding Projection-free SDP Relaxation of Binary Graph Classifier via GDPA Linearization [59.87663954467815]
Algorithm unfolding creates an interpretable and parsimonious neural network architecture by implementing each iteration of a model-based algorithm as a neural layer.
In this paper, leveraging a recent linear algebraic theorem called Gershgorin disc perfect alignment (GDPA), we unroll a projection-free algorithm for the semi-definite programming relaxation (SDR) of a binary graph classifier.
Experimental results show that our unrolled network outperformed pure model-based graph classifiers, and achieved comparable performance to pure data-driven networks but using far fewer parameters.
arXiv Detail & Related papers (2021-09-10T07:01:15Z)
- Semiparametric Bayesian Networks [5.205440005969871]
We introduce semiparametric Bayesian networks that combine parametric and nonparametric conditional probability distributions.
Their aim is to incorporate the bounded complexity of parametric models and the flexibility of nonparametric ones.
arXiv Detail & Related papers (2021-09-07T11:47:32Z)
- Random Features for the Neural Tangent Kernel [57.132634274795066]
We propose an efficient feature map construction for the Neural Tangent Kernel (NTK) of fully-connected ReLU networks.
We show that the dimension of the resulting features is much smaller than that of other baseline feature map constructions achieving comparable error bounds, both in theory and practice.
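That paper's NTK feature map is not reproduced here. As background on the random-features idea it builds on, this hedged sketch approximates the order-1 arc-cosine kernel, i.e. the kernel of a single infinite-width ReLU layer (Cho & Saul), with random ReLU features; the function names are illustrative, not from the paper.

```python
import numpy as np

# Background sketch (not the paper's construction): random ReLU features
# approximating the order-1 arc-cosine kernel of one infinite-width
# ReLU layer.

def arccos1_kernel(x, z):
    # Closed form: k(x, z) = ||x|| ||z|| (sin t + (pi - t) cos t) / pi
    nx, nz = np.linalg.norm(x), np.linalg.norm(z)
    t = np.arccos(np.clip(x @ z / (nx * nz), -1.0, 1.0))
    return (nx * nz / np.pi) * (np.sin(t) + (np.pi - t) * np.cos(t))

def random_relu_features(X, m, seed=0):
    # phi(x)_i = sqrt(2/m) * ReLU(w_i . x) with w_i ~ N(0, I), so that
    # phi(x) . phi(z) is a Monte Carlo estimate of arccos1_kernel(x, z)
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], m))
    return np.sqrt(2.0 / m) * np.maximum(X @ W, 0.0)

x = np.array([1.0, 0.5, -0.2])
z = np.array([0.3, -1.0, 0.8])
Phi = random_relu_features(np.stack([x, z]), m=100_000)
approx = Phi[0] @ Phi[1]   # inner product of finite-dimensional features
exact = arccos1_kernel(x, z)
```

The design question the paper addresses is how few such feature dimensions suffice for a given error bound; the sketch above only shows the Monte Carlo mechanism, not an optimized construction.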
arXiv Detail & Related papers (2021-04-03T09:08:12Z)
- Certifying Incremental Quadratic Constraints for Neural Networks via Convex Optimization [2.388501293246858]
We propose a convex program to certify incremental quadratic constraints on the map of neural networks over a region of interest.
These certificates can capture several useful properties such as (local) Lipschitz continuity, one-sided Lipschitz continuity, invertibility, and contraction.
arXiv Detail & Related papers (2020-12-10T21:15:00Z)
- Deep neural networks for inverse problems with pseudodifferential operators: an application to limited-angle tomography [0.4110409960377149]
We propose a novel convolutional neural network (CNN) designed for learning pseudodifferential operators ($\Psi$DOs) in the context of linear inverse problems.
We show that, under rather general assumptions on the forward operator, the unfolded iterations of ISTA can be interpreted as the successive layers of a CNN.
In particular, we prove that, in the case of LA-CT, the operations of upscaling, downscaling and convolution, can be exactly determined by combining the convolutional nature of the limited angle X-ray transform and basic properties defining a wavelet system.
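To make the "iterations as layers" reading concrete, here is a hedged sketch of plain (non-learned) unrolled ISTA on a generic sparse-recovery problem; each iteration is a linear step followed by a soft-thresholding nonlinearity, which is the structure one layer of the unrolled network mirrors. The paper replaces such fixed operators with learned convolutions, so this is not its architecture.

```python
import numpy as np

# Illustrative sketch, not the paper's architecture: each ISTA iteration
# reads as one "layer" -- a linear step, then a soft-threshold.

def soft_threshold(x, tau):
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def ista_unrolled(A, b, lam, n_layers):
    # Minimizes 0.5 ||A x - b||^2 + lam ||x||_1 with step size 1/L,
    # where L is the largest eigenvalue of A^T A
    L = np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_layers):
        x = soft_threshold(x - A.T @ (A @ x - b) / L, lam / L)
    return x

# Noiseless sparse-recovery toy problem: 3-sparse signal, 30 measurements
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 50))
x_true = np.zeros(50)
x_true[[3, 17, 41]] = [1.5, -2.0, 1.0]
b = A @ x_true
x_hat = ista_unrolled(A, b, lam=0.1, n_layers=5000)
```

In a learned unrolling, the matrices applied inside the loop (here `A.T` and `A`) and the threshold become trainable, layer-specific parameters.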
arXiv Detail & Related papers (2020-06-02T14:03:41Z)
- Revealing the Structure of Deep Neural Networks via Convex Duality [70.15611146583068]
We study regularized deep neural networks (DNNs) and introduce a convex analytic framework to characterize the structure of hidden layers.
We show that a set of optimal hidden layer weights for a norm regularized training problem can be explicitly found as the extreme points of a convex set.
We apply the same characterization to deep ReLU networks with whitened data and prove the same weight alignment holds.
arXiv Detail & Related papers (2020-02-22T21:13:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.