Understanding convolution on graphs via energies
- URL: http://arxiv.org/abs/2206.10991v5
- Date: Wed, 6 Sep 2023 06:58:11 GMT
- Title: Understanding convolution on graphs via energies
- Authors: Francesco Di Giovanni, James Rowbottom, Benjamin P. Chamberlain,
Thomas Markovich, Michael M. Bronstein
- Abstract summary: Graph Neural Networks (GNNs) typically operate by message-passing, where the state of a node is updated based on the information received from its neighbours.
Most message-passing models act as graph convolutions, where features are mixed by a shared, linear transformation before being propagated over the edges.
On node-classification tasks, graph convolutions have been shown to suffer from two limitations: poor performance on heterophilic graphs, and over-smoothing.
- Score: 23.18124653469668
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graph Neural Networks (GNNs) typically operate by message-passing, where the
state of a node is updated based on the information received from its
neighbours. Most message-passing models act as graph convolutions, where
features are mixed by a shared, linear transformation before being propagated
over the edges. On node-classification tasks, graph convolutions have been
shown to suffer from two limitations: poor performance on heterophilic graphs,
and over-smoothing. It is a common belief that both phenomena occur because such
models behave as low-pass filters, meaning that the Dirichlet energy of the
features decreases along the layers, incurring a smoothing effect that
ultimately makes features no longer distinguishable. In this work, we
rigorously prove that simple graph-convolutional models can actually enhance
high frequencies and even lead to an asymptotic behaviour we refer to as
over-sharpening, opposite to over-smoothing. We do so by showing that linear
graph convolutions with symmetric weights minimize a multi-particle energy that
generalizes the Dirichlet energy; in this setting, the weight matrices induce
edge-wise attraction (repulsion) through their positive (negative) eigenvalues,
thereby controlling whether the features are being smoothed or sharpened. We
also extend the analysis to non-linear GNNs, and demonstrate that some existing
time-continuous GNNs are instead always dominated by the low frequencies.
Finally, we validate our theoretical findings through ablations and real-world
experiments.
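The attraction/repulsion mechanism described in the abstract is easy to probe numerically. Below is a minimal sketch, assuming a residual Euler step X <- X + tau * A_hat X W of the gradient flow on a small cycle graph; it is an illustration, not the authors' code, and all names (A_hat, tau, rayleigh) are ours. A weight matrix with positive eigenvalues drives the normalized Dirichlet energy toward 0 (over-smoothing), while flipping the sign of its spectrum drives the energy toward its maximum of 2 (over-sharpening).

```python
import numpy as np

# Cycle graph C_8: symmetric adjacency, 2-regular, so D^{-1/2} A D^{-1/2} = A / 2.
n, d, tau = 8, 3, 0.5
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
A_hat = A / 2.0
L = np.eye(n) - A_hat  # symmetrically normalized graph Laplacian

def rayleigh(X):
    # Normalized Dirichlet energy tr(X^T L X) / ||X||_F^2, lies in [0, 2].
    return float(np.trace(X.T @ L @ X) / np.linalg.norm(X) ** 2)

rng = np.random.default_rng(0)
X0 = rng.standard_normal((n, d))
for sign, label in [(+1.0, "positive spectrum (attraction)"),
                    (-1.0, "negative spectrum (repulsion)")]:
    W = sign * np.eye(d)  # shared symmetric weights, eigenvalues all +1 or all -1
    X = X0.copy()
    for _ in range(60):
        X = X + tau * (A_hat @ X @ W)  # residual linear graph convolution
        X /= np.linalg.norm(X)         # rescale: only the direction matters
    print(f"{label}: normalized Dirichlet energy = {rayleigh(X):.3f}")
# Attraction ends near 0 (over-smoothing); repulsion ends near 2, the maximum
# on this bipartite graph (over-sharpening).
```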
Related papers
- Almost Surely Asymptotically Constant Graph Neural Networks [7.339728196535312]
We show that the output converges to a constant function, which upper-bounds what these classifiers can uniformly express.
Results apply to a broad class of random graph models.
We empirically validate these findings, observing that the convergence phenomenon appears not only on random graphs but also on some real-world graphs.
arXiv Detail & Related papers (2024-03-06T17:40:26Z)
- Graph Generation via Spectral Diffusion [51.60814773299899]
We present GRASP, a novel graph generative model based on 1) the spectral decomposition of the graph Laplacian matrix and 2) a diffusion process.
Specifically, we propose to use a denoising model to sample eigenvectors and eigenvalues from which we can reconstruct the graph Laplacian and adjacency matrix.
Our permutation invariant model can also handle node features by concatenating them to the eigenvectors of each node. (A toy version of the Laplacian reconstruction is sketched below.)
arXiv Detail & Related papers (2024-02-29T09:26:46Z)
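As a hedged illustration of the reconstruction step in the GRASP summary above (not the authors' code; the function name and threshold are our choices), one can rebuild the Laplacian from eigenvectors and eigenvalues and read an adjacency matrix off its off-diagonal entries:

```python
import numpy as np

def laplacian_to_adjacency(U, lam, threshold=0.5):
    # Rebuild L = U diag(lam) U^T, then recover A from L = D - A.
    L = U @ np.diag(lam) @ U.T
    A = -L                      # off-diagonal entries of L are -A
    np.fill_diagonal(A, 0.0)
    return (A > threshold).astype(float)  # binarize edges

# Round-trip sanity check on a toy 4-node graph.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A  # combinatorial Laplacian
lam, U = np.linalg.eigh(L)
assert np.array_equal(laplacian_to_adjacency(U, lam), A)
```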
- Advective Diffusion Transformers for Topological Generalization in Graph Learning [69.2894350228753]
We show how graph diffusion equations extrapolate and generalize in the presence of varying graph topologies.
We propose a novel graph encoder backbone, Advective Diffusion Transformer (ADiT), inspired by advective graph diffusion equations.
arXiv Detail & Related papers (2023-10-10T08:40:47Z)
- Geometric Graph Filters and Neural Networks: Limit Properties and Discriminability Trade-offs [122.06927400759021]
We study the relationship between a graph neural network (GNN) and a manifold neural network (MNN) when the graph is constructed from a set of points sampled from the manifold.
We prove non-asymptotic error bounds showing that convolutional filters and neural networks on these graphs converge to convolutional filters and neural networks on the continuous manifold.
arXiv Detail & Related papers (2023-05-29T08:27:17Z)
- A Fractional Graph Laplacian Approach to Oversmoothing [15.795926248847026]
We generalize the concept of oversmoothing from undirected to directed graphs.
We propose fractional graph Laplacian neural ODEs, which describe non-local dynamics.
Our method is more flexible with respect to the convergence of the graph's Dirichlet energy, thereby mitigating oversmoothing. (A minimal fractional-Laplacian computation is sketched below.)
arXiv Detail & Related papers (2023-05-22T14:52:33Z)
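A minimal sketch of a fractional graph Laplacian under the common spectral definition L^alpha = U diag(lam^alpha) U^T (our illustration; the paper's exact operator may differ). For non-integer alpha the result is dense, coupling non-adjacent nodes, which is the non-local dynamics referred to above:

```python
import numpy as np

def fractional_laplacian(A, alpha=0.5):
    deg = A.sum(axis=1)
    L = np.eye(len(A)) - A / np.sqrt(np.outer(deg, deg))  # normalized Laplacian
    lam, U = np.linalg.eigh(L)     # symmetric PSD, so lam >= 0
    lam = np.clip(lam, 0.0, None)  # guard against tiny negative round-off
    return U @ np.diag(lam ** alpha) @ U.T

# Path graph 1-2-3-4: L couples only neighbours; L^{1/2} couples all node pairs.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
print(np.round(fractional_laplacian(A, 0.5), 2))
```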
- OrthoReg: Improving Graph-regularized MLPs via Orthogonality Regularization [66.30021126251725]
Graph Neural Networks (GNNs) currently dominate the modeling of graph-structured data.
Graph-regularized MLPs (GR-MLPs) implicitly inject the graph structure information into model weights, yet their performance can hardly match that of GNNs in most tasks.
We show that GR-MLPs suffer from dimensional collapse, a phenomenon in which the largest few eigenvalues dominate the embedding space.
We propose OrthoReg, a novel GR-MLP model that mitigates the dimensional collapse issue. (An illustrative orthogonality penalty is sketched below.)
arXiv Detail & Related papers (2023-01-31T21:20:48Z)
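The following sketch conveys the general flavour of an orthogonality regularizer, not OrthoReg's actual objective (the function name and the weighting in the usage comment are hypothetical): it penalizes correlations between embedding dimensions so that no small set of directions dominates the feature covariance.

```python
import numpy as np

def orthogonality_penalty(Z):
    Zc = Z - Z.mean(axis=0)              # centre each embedding dimension
    C = (Zc.T @ Zc) / len(Z)             # d x d feature covariance
    off_diag = C - np.diag(np.diag(C))
    return float(np.sum(off_diag ** 2))  # drives the covariance toward diagonal

# Hypothetical usage: total_loss = task_loss + lam * orthogonality_penalty(Z)
```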
- A Non-Asymptotic Analysis of Oversmoothing in Graph Neural Networks [33.35609077417775]
We characterize the mechanism behind oversmoothing via a non-asymptotic analysis.
We show that oversmoothing happens once the mixing effect starts to dominate the denoising effect.
Our results suggest that while PPR mitigates oversmoothing at deeper layers, PPR-based architectures still achieve their best performance at a shallow depth. (A PPR propagation step is sketched below.)
arXiv Detail & Related papers (2022-12-21T00:33:59Z)
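To make the PPR mechanism concrete, here is a sketch of personalized-PageRank propagation in the style of APPNP (our illustration, not this paper's code; A_hat, alpha, and ppr_propagate are our names). The teleport term alpha * X0 retains a fixed fraction of the input features at every depth, which is why the mixing (smoothing) effect stays bounded in deep PPR architectures:

```python
import numpy as np

def ppr_propagate(A_hat, X0, alpha=0.1, steps=10):
    # X_{k+1} = (1 - alpha) * A_hat X_k + alpha * X_0
    X = X0.copy()
    for _ in range(steps):
        X = (1.0 - alpha) * (A_hat @ X) + alpha * X0
    return X
```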
- Neural Sheaf Diffusion: A Topological Perspective on Heterophily and Oversmoothing in GNNs [16.88394293874848]
We use cellular sheaf theory to show that the underlying geometry of the graph is deeply linked with the performance of GNNs.
By considering a hierarchy of increasingly general sheaves, we study how the ability of the sheaf diffusion process to achieve linear separation of the classes in the infinite time limit expands as the sheaves become more general.
We prove that when the sheaf is non-trivial, discretised parametric diffusion processes have greater control than GNNs over their behaviour.
arXiv Detail & Related papers (2022-02-09T17:25:02Z)
- Nonlinear State-Space Generalizations of Graph Convolutional Neural Networks [172.18295279061607]
Graph convolutional neural networks (GCNNs) learn compositional representations from network data by nesting linear graph convolutions into nonlinearities.
In this work, we approach GCNNs from a state-space perspective revealing that the graph convolutional module is a minimalistic linear state-space model.
We show that this state update may be problematic because it is nonparametric, and depending on the graph spectrum it may explode or vanish.
We propose a novel family of nodal aggregation rules that aggregate node features within a layer in a nonlinear state-space parametric fashion, allowing for a better trade-off. (The explode/vanish behaviour is illustrated below.)
arXiv Detail & Related papers (2020-10-27T19:48:56Z)
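A tiny experiment (ours, not the paper's) illustrates the explode/vanish point: treating the graph shift S as a state-transition matrix, k propagation steps scale feature norms roughly like rho(S)^k, so an unnormalized spectrum is unstable in depth.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 16
S = (rng.random((n, n)) < 0.3).astype(float)
S = np.triu(S, 1); S = S + S.T  # symmetric adjacency used as the graph shift
x = rng.standard_normal(n)
for k in (1, 5, 10):
    print(k, np.linalg.norm(np.linalg.matrix_power(S, k) @ x))
# Norms blow up because rho(S) > 1; rescaling S by rho(S) tames the growth, but
# then every non-dominant component vanishes instead.
```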
- Dirichlet Graph Variational Autoencoder [65.94744123832338]
We present Dirichlet Graph Variational Autoencoder (DGVAE) with graph cluster memberships as latent factors.
Motivated by the low pass characteristics in balanced graph cut, we propose a new variant of GNN named Heatts to encode the input graph into cluster memberships.
arXiv Detail & Related papers (2020-10-09T07:35:26Z)
- A Note on Over-Smoothing for Graph Neural Networks [13.008323851750442]
We build upon previous results (Oono & Suzuki, 2019) to analyze the over-smoothing effect in general graph neural network architectures.
We show that when the weight matrices satisfy conditions determined by the spectrum of the augmented normalized Laplacian, the Dirichlet energy of the embeddings converges to zero, resulting in a loss of discriminative power.
arXiv Detail & Related papers (2020-06-23T20:36:56Z)