Rank Collapse Causes Over-Smoothing and Over-Correlation in Graph Neural Networks
- URL: http://arxiv.org/abs/2308.16800v3
- Date: Tue, 17 Sep 2024 19:19:17 GMT
- Title: Rank Collapse Causes Over-Smoothing and Over-Correlation in Graph Neural Networks
- Authors: Andreas Roth, Thomas Liebig
- Abstract summary: We show that with increased depth, node representations become dominated by a low-dimensional subspace that depends on the aggregation function but not on the feature transformations.
For all aggregation functions, the rank of the node representations collapses, resulting in over-smoothing for particular aggregation functions.
- Score: 3.566568169425391
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Our study reveals new theoretical insights into over-smoothing and feature over-correlation in graph neural networks. Specifically, we demonstrate that with increased depth, node representations become dominated by a low-dimensional subspace that depends on the aggregation function but not on the feature transformations. For all aggregation functions, the rank of the node representations collapses, resulting in over-smoothing for particular aggregation functions. Our study emphasizes the importance for future research to focus on rank collapse rather than over-smoothing. Guided by our theory, we propose a sum of Kronecker products as a beneficial property that provably prevents over-smoothing, over-correlation, and rank collapse. We empirically demonstrate the shortcomings of existing models in fitting target functions of node classification tasks.
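The rank-collapse claim can be checked with a short numerical simulation. The sketch below is illustrative only and not the authors' code: it applies a GCN-style symmetrically normalized adjacency together with orthogonal (norm-preserving) feature transformations on a random graph and tracks how the singular-value spectrum of the node representations degenerates with depth; graph size, feature width, and depth are arbitrary choices.

```python
# Illustrative sketch (not the paper's code): track the numerical rank of node
# representations X_k under repeated message passing X_{k+1} = A_hat @ X_k @ W_k,
# where A_hat is the symmetrically normalized adjacency with self-loops and the
# W_k are orthogonal, so any collapse is driven by the aggregation alone.
import numpy as np

rng = np.random.default_rng(0)
n, d, depth = 50, 16, 64                      # nodes, feature width, layers (arbitrary)

A = (rng.random((n, n)) < 0.1).astype(float)  # random undirected graph
A = np.maximum(A, A.T)
np.fill_diagonal(A, 1.0)                      # add self-loops
d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
A_hat = A * np.outer(d_inv_sqrt, d_inv_sqrt)  # D^{-1/2} A D^{-1/2}

X = rng.standard_normal((n, d))
for k in range(depth):
    W, _ = np.linalg.qr(rng.standard_normal((d, d)))  # orthogonal feature transformation
    X = A_hat @ X @ W
    X /= np.linalg.norm(X)                            # rescale to avoid underflow
    if (k + 1) % 16 == 0:
        s = np.linalg.svd(X, compute_uv=False)
        rank = int(np.sum(s > 1e-6 * s[0]))
        print(f"layer {k + 1:3d}: numerical rank = {rank}, sigma_2/sigma_1 = {s[1] / s[0]:.2e}")
```

Because X_k = Â^k X_0 W_1 ⋯ W_k in this setup, the aggregation operator alone pushes the representations toward its dominant eigenspace, consistent with the abstract's point that the limiting subspace depends on the aggregation and not on the feature transformations. The proposed sum-of-Kronecker-products transformations, which act on vec(X) and cannot be factored this way, are not sketched here.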
Related papers
- Unitary convolutions for learning on graphs and groups [0.9899763598214121]
We study unitary group convolutions, which allow for deeper networks that are more stable during training.
The main focus of the paper is graph neural networks, where we show that unitary graph convolutions provably avoid over-smoothing (a minimal sketch of a norm-preserving transformation appears after this list).
Our experimental results confirm that unitary graph convolutional networks achieve competitive performance on benchmark datasets.
arXiv Detail & Related papers (2024-10-07T21:09:14Z) - LSEnet: Lorentz Structural Entropy Neural Network for Deep Graph Clustering [59.89626219328127]
Graph clustering is a fundamental problem in machine learning.
Deep learning methods have achieved state-of-the-art results in recent years, but they still cannot work without a predefined number of clusters.
We propose to address this problem from a fresh perspective of graph information theory.
arXiv Detail & Related papers (2024-05-20T05:46:41Z) - Learning to Approximate Adaptive Kernel Convolution on Graphs [4.434835769977399]
We propose a diffusion learning framework, where the range of feature aggregation is controlled by the scale of a diffusion kernel.
Our model is tested on various standard datasets for node-wise classification, achieving state-of-the-art performance.
It is also validated on real-world brain network data for graph classification to demonstrate its practicality for Alzheimer's disease classification.
arXiv Detail & Related papers (2024-01-22T10:57:11Z) - A Neural Collapse Perspective on Feature Evolution in Graph Neural Networks [44.31777384413466]
Graph neural networks (GNNs) have become increasingly popular for classification tasks on graph-structured data.
In this paper, we focus on node-wise classification and explore the feature evolution through the lens of the "Neural Collapse" phenomenon.
We show that even an "optimistic" mathematical model requires that the graphs obey a strict structural condition in order to possess a minimizer with exact collapse.
arXiv Detail & Related papers (2023-07-04T23:03:21Z) - Graph Neural Networks Provably Benefit from Structural Information: A Feature Learning Perspective [53.999128831324576]
Graph neural networks (GNNs) have pioneered advancements in graph representation learning.
This study investigates the role of graph convolution within the context of feature learning theory.
arXiv Detail & Related papers (2023-06-24T10:21:11Z) - What functions can Graph Neural Networks compute on random graphs? The role of Positional Encoding [0.0]
We aim to deepen the theoretical understanding of Graph Neural Networks (GNNs) on large graphs, with a focus on their expressive power.
Recently, several works showed that, on very general random graph models, GNNs converge to certain functions as the number of nodes grows.
arXiv Detail & Related papers (2023-05-24T07:09:53Z) - OrthoReg: Improving Graph-regularized MLPs via Orthogonality Regularization [66.30021126251725]
Graph Neural Networks (GNNs) are currently dominating in modeling graph-structured data.
Graph-regularized MLPs (GR-MLPs) implicitly inject the graph structure information into model weights, while their performance can hardly match that of GNNs in most tasks.
We show that GR-MLPs suffer from dimensional collapse, a phenomenon in which the few largest eigenvalues dominate the embedding space.
We propose OrthoReg, a novel GR-MLP model to mitigate the dimensional collapse issue.
arXiv Detail & Related papers (2023-01-31T21:20:48Z) - Learning Graph Structure from Convolutional Mixtures [119.45320143101381]
We propose a graph convolutional relationship between the observed and latent graphs, and formulate the graph learning task as a network inverse (deconvolution) problem.
In lieu of eigendecomposition-based spectral methods, we unroll and truncate proximal gradient iterations to arrive at a parameterized neural network architecture that we call a Graph Deconvolution Network (GDN).
GDNs can learn a distribution of graphs in a supervised fashion, perform link prediction or edge-weight regression tasks by adapting the loss function, and they are inherently inductive.
arXiv Detail & Related papers (2022-05-19T14:08:15Z) - Towards Lower Bounds on the Depth of ReLU Neural Networks [7.355977594790584]
We investigate whether the class of exactly representable functions strictly increases by adding more layers.
We settle an old conjecture about piecewise linear functions by Wang and Sun (2005) in the affirmative.
We present upper bounds on the sizes of neural networks required to represent functions with logarithmic depth.
arXiv Detail & Related papers (2021-05-31T09:49:14Z) - Towards Deeper Graph Neural Networks [63.46470695525957]
Graph convolutions perform neighborhood aggregation and represent one of the most important graph operations.
Several recent studies attribute the performance deterioration seen in deeper models to the over-smoothing issue.
We propose Deep Adaptive Graph Neural Network (DAGNN) to adaptively incorporate information from large receptive fields.
arXiv Detail & Related papers (2020-07-18T01:11:14Z) - Graph Neural Networks with Composite Kernels [60.81504431653264]
We re-interpret node aggregation from the perspective of kernel weighting.
We present a framework to consider feature similarity in an aggregation scheme.
We propose feature aggregation as the composition of the original neighbor-based kernel and a learnable kernel to encode feature similarities in a feature space.
arXiv Detail & Related papers (2020-05-16T04:44:29Z)
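As a companion to the first related paper above, the following is a minimal, hypothetical sketch of a norm-preserving feature transformation: an orthogonal matrix is obtained from a skew-symmetric parameter via the Cayley transform and used in a GCN-style update. This is not the construction from "Unitary convolutions for learning on graphs and groups", which also constrains the propagation step; all names and sizes here are illustrative.

```python
# Hedged sketch (not the cited paper's method): build an orthogonal feature
# transformation Q = (I - S)(I + S)^{-1} from a skew-symmetric S (Cayley
# transform) and use it in a GCN-style update A_hat @ X @ Q.
import numpy as np

def cayley_orthogonal(S: np.ndarray) -> np.ndarray:
    """Map a skew-symmetric matrix S to an orthogonal matrix via the Cayley transform."""
    I = np.eye(S.shape[0])
    # Solve Q (I + S) = (I - S) for Q without forming an explicit inverse.
    return np.linalg.solve((I + S).T, (I - S).T).T

def norm_preserving_layer(A_hat: np.ndarray, X: np.ndarray, P: np.ndarray) -> np.ndarray:
    """One propagation step; only the skew-symmetric part of the parameter P is used."""
    S = 0.5 * (P - P.T)            # skew-symmetric, so I + S is always invertible
    Q = cayley_orthogonal(S)       # orthogonal, hence ||X @ Q||_F == ||X||_F
    return A_hat @ X @ Q

# Quick checks on a toy example.
rng = np.random.default_rng(1)
P = rng.standard_normal((8, 8))
Q = cayley_orthogonal(0.5 * (P - P.T))
print(np.allclose(Q.T @ Q, np.eye(8)))       # True: Q is orthogonal up to round-off

A_hat = np.eye(5)                             # stand-in propagation operator for the demo
X = rng.standard_normal((5, 8))
Y = norm_preserving_layer(A_hat, X, P)
print(np.isclose(np.linalg.norm(Y), np.linalg.norm(X)))  # norm preserved for orthogonal A_hat
```

Orthogonality of Q means the feature transformation itself neither shrinks nor inflates the spectrum of X; whether a full layer avoids over-smoothing still depends on the propagation operator, which is precisely what the cited paper makes unitary.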
This list is automatically generated from the titles and abstracts of the papers on this site.