Sparse Graph Learning with Eigen-gap for Spectral Filter Training in
Graph Convolutional Networks
- URL: http://arxiv.org/abs/2202.13526v1
- Date: Mon, 28 Feb 2022 03:39:48 GMT
- Title: Sparse Graph Learning with Eigen-gap for Spectral Filter Training in
Graph Convolutional Networks
- Authors: Jin Zeng, Saghar Bagheri, Yang Liu, Gene Cheung, Wei Hu
- Abstract summary: We show that a sparse graph Laplacian matrix $L$ closest to $\bar{C}^{-1}$ can be used to promote a deeper graph convolutional neural net (GCN) architecture.
Experiments show that our proposal produced deeper GCNs and smaller errors compared to a competing scheme without explicit eigen-gap optimization.
- Score: 38.92746674849527
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It is now known that the expressive power of graph convolutional neural nets
(GCN) does not grow infinitely with the number of layers. Instead, the GCN
output approaches a subspace spanned by the first eigenvector of the normalized
graph Laplacian matrix with the convergence rate characterized by the
"eigen-gap": the difference between the Laplacian's first two distinct
eigenvalues. To promote a deeper GCN architecture with sufficient
expressiveness, in this paper, given an empirical covariance matrix $\bar{C}$
computed from observable data, we learn a sparse graph Laplacian matrix $L$
closest to $\bar{C}^{-1}$ while maintaining a desirable eigen-gap that slows
down convergence. Specifically, we first define a sparse graph learning problem
with constraints on the first eigenvector (the most common signal) and the
eigen-gap. We solve the corresponding dual problem greedily, where a locally
optimal eigen-pair is computed one at a time via a fast approximation of a
semi-definite programming (SDP) formulation. The computed $L$ with the desired
eigen-gap is normalized spectrally and used for supervised training of GCN for
a targeted task. Experiments show that our proposal produced deeper GCNs and
smaller errors compared to a competing scheme without explicit eigen-gap
optimization.
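
Read purely from the abstract, the learning step can be sketched as the following optimization (an illustrative rendering of the stated ingredients only; $\rho$, $v_1$, and $g$ are assumed symbols for a sparsity weight, the prescribed first eigenvector, and the eigen-gap target, and the paper's actual objective, constraint form, and dual derivation are not reproduced here):

$$\min_{L \in \mathcal{L}} \ \|L - \bar{C}^{-1}\|_F^2 + \rho \,\|L\|_1 \quad \text{s.t.} \quad L v_1 = \lambda_1 v_1, \quad \lambda_2(L) - \lambda_1(L) = g,$$

where $\mathcal{L}$ is the set of valid graph Laplacian matrices and $\lambda_1 \le \lambda_2$ denote the first two distinct eigenvalues of $L$.

The snippet below is a minimal, hedged sketch (not the paper's algorithm): it computes the eigen-gap of a symmetrically normalized Laplacian and shows a graph signal collapsing toward the first eigenvector's subspace under repeated low-pass filtering. NumPy is assumed; the 5-node cycle graph, the helper names, and the simple linear filter $(I - L_n)$ are toy choices for illustration.

```python
# Minimal illustrative sketch (not the paper's learning algorithm).
import numpy as np

def normalized_laplacian(L: np.ndarray) -> np.ndarray:
    """Symmetric normalization L_n = D^{-1/2} L D^{-1/2} of a combinatorial Laplacian."""
    d = np.diag(L).copy()
    d[d == 0] = 1.0                      # guard against isolated nodes
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ L @ D_inv_sqrt

def eigen_gap(L_n: np.ndarray, tol: float = 1e-8) -> float:
    """Difference between the first two *distinct* eigenvalues, as in the abstract."""
    evals = np.linalg.eigvalsh(L_n)      # ascending order
    distinct = [evals[0]]
    for lam in evals[1:]:
        if lam - distinct[-1] > tol:
            distinct.append(lam)
    return distinct[1] - distinct[0] if len(distinct) > 1 else 0.0

# Toy graph: a 5-node cycle. Combinatorial Laplacian L = D - W.
n = 5
W = np.zeros((n, n))
for i in range(n):
    W[i, (i + 1) % n] = W[(i + 1) % n, i] = 1.0
L = np.diag(W.sum(axis=1)) - W

L_n = normalized_laplacian(L)
print("eigen-gap of normalized Laplacian:", eigen_gap(L_n))

# Repeatedly applying the filter (I - L_n) preserves the component along the
# first eigenvector and shrinks the rest; in particular the component at the
# second distinct eigenvalue decays by a factor |1 - eigen-gap| per step,
# which is the link between the eigen-gap and over-smoothing that the
# abstract refers to.
x = np.random.default_rng(0).standard_normal(n)
for _ in range(50):
    x = (np.eye(n) - L_n) @ x
print("signal after 50 propagation steps:", x)
```

In this toy setting the component tied to the eigen-gap vanishes faster when the gap is larger; the paper learns $L$ so that the gap takes a value that slows this collapse, which is what allows a deeper GCN to remain expressive.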
Related papers
- Convergence Analysis of Natural Gradient Descent for Over-parameterized Physics-Informed Neural Networks [3.680127959836384]
First-order methods, such as gradient descent (GD) and stochastic gradient descent (SGD), have been proven effective in training neural networks.
However, the learning rate of GD for training two-layer neural networks exhibits poor dependence on the sample size and the Gram matrix.
In this paper, we show that for $L^2$ regression problems the learning rate can be improved, which implies that GD actually enjoys a faster convergence rate.
arXiv Detail & Related papers (2024-08-01T14:06:34Z) - Scalable Graph Compressed Convolutions [68.85227170390864]
We propose a differentiable method that applies permutations to calibrate input graphs for Euclidean convolution.
Based on the graph calibration, we propose the Compressed Convolution Network (CoCN) for hierarchical graph representation learning.
arXiv Detail & Related papers (2024-07-26T03:14:13Z) - Implicit Bias and Fast Convergence Rates for Self-attention [30.08303212679308]
Self-attention, the core mechanism of transformers, distinguishes them from traditional neural networks and drives their outstanding performance.
We investigate the implicit bias of gradient descent (GD) in training a self-attention layer with a fixed linear decoder in binary classification.
We provide the first finite-time convergence rate for $W_t$ to $W_{mm}$, along with the rate of sparsification in the attention map.
arXiv Detail & Related papers (2024-02-08T15:15:09Z) - Stable Nonconvex-Nonconcave Training via Linear Interpolation [51.668052890249726]
This paper presents a theoretical analysis of linear interpolation as a principled method for stabilizing (large-scale) neural network training.
We argue that instabilities in the optimization process are often caused by the nonmonotonicity of the loss landscape and show how linear interpolation can help by leveraging the theory of nonexpansive operators.
arXiv Detail & Related papers (2023-10-20T12:45:12Z) - One-step Bipartite Graph Cut: A Normalized Formulation and Its
Application to Scalable Subspace Clustering [56.81492360414741]
We show how to enforce a one-step normalized cut for bipartite graphs, notably with linear-time complexity.
In this paper, we first characterize a novel one-step bipartite graph cut criterion with normalized constraints, and theoretically prove its equivalence to a trace problem.
We extend this cut criterion to a scalable subspace clustering approach, where adaptive anchor learning, bipartite graph learning, and one-step normalized bipartite graph partitioning are simultaneously modeled.
arXiv Detail & Related papers (2023-05-12T11:27:20Z) - Stochastic Parallelizable Eigengap Dilation for Large Graph Clustering [12.544602297450533]
We aim to identify clusters of nodes where most edges fall within clusters and only a few edges fall between clusters.
A core step of spectral clustering is performing an eigendecomposition of the corresponding graph Laplacian matrix.
This paper introduces a parallelizable approach to dilating the spectrum in order to accelerate SVD solvers and, in turn, spectral clustering.
arXiv Detail & Related papers (2022-07-29T10:13:07Z) - SpecNet2: Orthogonalization-free spectral embedding by neural networks [11.688030627514532]
This paper introduces a new neural network approach, named SpecNet2, to compute spectral embedding.
SpecNet2 also allows separating the sampling of rows and columns of the graph affinity matrix.
Numerical experiments demonstrate the improved performance and computational efficiency of SpecNet2 on simulated data and image datasets.
arXiv Detail & Related papers (2022-06-14T07:19:10Z) - Hybrid Model-based / Data-driven Graph Transform for Image Coding [54.31406300524195]
We present a hybrid model-based / data-driven approach to encode an intra-prediction residual block.
The first $K$ eigenvectors of a transform matrix are derived from a statistical model, e.g., the asymmetric discrete sine transform (ADST) for stability.
Using WebP as a baseline image codec, experimental results show that our hybrid graph transform achieved better energy compaction than the default discrete cosine transform (DCT) and better stability than the Karhunen-Loève transform (KLT).
arXiv Detail & Related papers (2022-03-02T15:36:44Z) - Unfolding Projection-free SDP Relaxation of Binary Graph Classifier via
GDPA Linearization [59.87663954467815]
Algorithm unfolding creates an interpretable and parsimonious neural network architecture by implementing each iteration of a model-based algorithm as a neural layer.
In this paper, leveraging a recent linear algebraic theorem called Gershgorin disc perfect alignment (GDPA), we unroll a projection-free algorithm for semi-definite programming relaxation (SDR) of a binary graph classifier.
Experimental results show that our unrolled network outperformed pure model-based graph classifiers, and achieved comparable performance to pure data-driven networks while using far fewer parameters.
arXiv Detail & Related papers (2021-09-10T07:01:15Z)