Neural Sheaf Diffusion: A Topological Perspective on Heterophily and
Oversmoothing in GNNs
- URL: http://arxiv.org/abs/2202.04579v1
- Date: Wed, 9 Feb 2022 17:25:02 GMT
- Title: Neural Sheaf Diffusion: A Topological Perspective on Heterophily and
Oversmoothing in GNNs
- Authors: Cristian Bodnar, Francesco Di Giovanni, Benjamin Paul Chamberlain,
Pietro Liò, Michael M. Bronstein
- Abstract summary: We use cellular sheaf theory to show that the underlying geometry of the graph is deeply linked with the performance of GNNs.
By considering a hierarchy of increasingly general sheaves, we study how the ability of the sheaf diffusion process to achieve linear separation of the classes in the infinite time limit expands.
We prove that when the sheaf is non-trivial, discretised parametric diffusion processes have greater control than GNNs over their asymptotic behaviour.
- Score: 16.88394293874848
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cellular sheaves equip graphs with "geometrical" structure by assigning
vector spaces and linear maps to nodes and edges. Graph Neural Networks (GNNs)
implicitly assume a graph with a trivial underlying sheaf. This choice is
reflected in the structure of the graph Laplacian operator, the properties of
the associated diffusion equation, and the characteristics of the convolutional
models that discretise this equation. In this paper, we use cellular sheaf
theory to show that the underlying geometry of the graph is deeply linked with
the performance of GNNs in heterophilic settings and their oversmoothing
behaviour. By considering a hierarchy of increasingly general sheaves, we study
how the ability of the sheaf diffusion process to achieve linear separation of
the classes in the infinite time limit expands. At the same time, we prove that
when the sheaf is non-trivial, discretised parametric diffusion processes have
greater control than GNNs over their asymptotic behaviour. On the practical
side, we study how sheaves can be learned from data. The resulting sheaf
diffusion models have many desirable properties that address the limitations of
classical graph diffusion equations (and corresponding GNN models) and obtain
state-of-the-art results in heterophilic settings. Overall, our work provides
new connections between GNNs and algebraic topology and would be of interest to
both fields.
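
For readers who want to see the construction concretely, the following is a minimal NumPy sketch, not the paper's implementation: it builds a block sheaf Laplacian from hand-picked restriction maps on a toy cycle graph and runs an explicit Euler discretisation of the diffusion equation dX/dt = -L_F X. The stalk dimension d, the random orthogonal restriction maps, and the step size alpha are illustrative assumptions; in the paper the restriction maps are learned from data.

# Minimal sketch of sheaf diffusion on a toy graph. Each node carries a
# d-dimensional stalk, and each incidence (node v, edge e) carries a linear
# restriction map F_{v <| e}. The block sheaf Laplacian and the Euler step
# are standard cellular-sheaf constructions; the random orthogonal maps are
# purely illustrative stand-ins for the maps the paper learns from data.
import numpy as np

rng = np.random.default_rng(0)

n, d = 5, 2                                        # number of nodes, stalk dimension
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]   # toy cycle graph

def random_orthogonal(dim):
    """Random dim x dim orthogonal matrix (QR of a Gaussian matrix)."""
    q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
    return q

# Restriction maps F_{v <| e} for each incident (node, edge) pair.
F = {(v, e): random_orthogonal(d) for e in edges for v in e}

# Assemble the block sheaf Laplacian L_F (an nd x nd matrix):
#   diagonal block (v, v):   sum over incident edges of F_{v<|e}^T F_{v<|e}
#   off-diagonal (u, v):    -F_{u<|e}^T F_{v<|e} for the edge e = (u, v)
L = np.zeros((n * d, n * d))
for e in edges:
    u, v = e
    Fu, Fv = F[(u, e)], F[(v, e)]
    L[u*d:(u+1)*d, u*d:(u+1)*d] += Fu.T @ Fu
    L[v*d:(v+1)*d, v*d:(v+1)*d] += Fv.T @ Fv
    L[u*d:(u+1)*d, v*d:(v+1)*d] -= Fu.T @ Fv
    L[v*d:(v+1)*d, u*d:(u+1)*d] -= Fv.T @ Fu

# Explicit Euler discretisation of the sheaf diffusion equation dX/dt = -L_F X.
X = rng.standard_normal((n * d, 3))    # 3 feature channels per stalk
alpha = 0.1                            # illustrative step size
for _ in range(50):
    X = X - alpha * (L @ X)

# With identity restriction maps this reduces to ordinary graph heat diffusion,
# which converges to a constant signal (the oversmoothing regime); non-trivial
# maps change the harmonic space and hence the asymptotic behaviour.
print(np.round(X[:d].T, 3))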
Related papers
- Generalization of Geometric Graph Neural Networks [84.01980526069075]
We study the generalization capabilities of geometric graph neural networks (GNNs).
We prove a generalization gap between the optimal empirical risk and the optimal statistical risk of this GNN.
The most important observation is that this generalization capability can be realized with a single large graph, rather than being limited by the size of the graph as in previous results.
arXiv Detail & Related papers (2024-09-08T18:55:57Z)
- Graph Classification with GNNs: Optimisation, Representation and Inductive Bias [0.6445605125467572]
We argue that such equivalence ignores the accompanying optimization issues and does not provide a holistic view of the GNN learning process.
We prove theoretically that message-passing layers have a tendency to search for either discriminative subgraphs or a collection of discriminative nodes dispersed across the graph.
arXiv Detail & Related papers (2024-08-17T18:15:44Z)
- LSEnet: Lorentz Structural Entropy Neural Network for Deep Graph Clustering [59.89626219328127]
Graph clustering is a fundamental problem in machine learning.
Deep learning methods have achieved state-of-the-art results in recent years, but they still cannot work without a predefined number of clusters.
We propose to address this problem from a fresh perspective of graph information theory.
arXiv Detail & Related papers (2024-05-20T05:46:41Z)
- Graph Neural Aggregation-diffusion with Metastability [4.040326569845733]
Continuous graph neural models based on differential equations have expanded the architecture of graph neural networks (GNNs).
We propose GRADE, inspired by graph aggregation-diffusion equations, which incorporates a delicate balance between nonlinear diffusion and aggregation induced by interaction potentials.
We show that GRADE achieves competitive performance across various benchmarks and alleviates the over-smoothing issue in GNNs.
arXiv Detail & Related papers (2024-03-29T15:05:57Z)
- Advective Diffusion Transformers for Topological Generalization in Graph Learning [69.2894350228753]
We show how graph diffusion equations extrapolate and generalize in the presence of varying graph topologies.
We propose a novel graph encoder backbone, Advective Diffusion Transformer (ADiT), inspired by advective graph diffusion equations.
arXiv Detail & Related papers (2023-10-10T08:40:47Z)
- Geometric Graph Filters and Neural Networks: Limit Properties and Discriminability Trade-offs [122.06927400759021]
We study the relationship between a graph neural network (GNN) and a manifold neural network (MNN) when the graph is constructed from a set of points sampled from the manifold.
We prove non-asymptotic error bounds showing that convolutional filters and neural networks on these graphs converge to convolutional filters and neural networks on the continuous manifold.
arXiv Detail & Related papers (2023-05-29T08:27:17Z)
- Graph Generation with Diffusion Mixture [57.78958552860948]
Graph generation is a major challenge for real-world tasks that require understanding the complex nature of non-Euclidean graph structures.
We propose a generative framework that models the topology of graphs by explicitly learning the final graph structures of the diffusion process.
arXiv Detail & Related papers (2023-02-07T17:07:46Z)
- Capturing Graphs with Hypo-Elliptic Diffusions [7.704064306361941]
We show that the distribution of random walks evolves according to a diffusion equation defined using the graph Laplacian.
This results in a novel tensor-valued graph operator, which we call the hypo-elliptic graph Laplacian.
We show that this method competes with graph transformers on datasets requiring long-range reasoning but scales only linearly in the number of edges.
arXiv Detail & Related papers (2022-05-27T16:47:34Z)
- Graph Kernel Neural Networks [53.91024360329517]
We propose to use graph kernels, i.e. kernel functions that compute an inner product on graphs, to extend the standard convolution operator to the graph domain.
This allows us to define an entirely structural model that does not require computing the embedding of the input graph.
Our architecture allows plugging in any type of graph kernel and has the added benefit of providing some interpretability.
arXiv Detail & Related papers (2021-12-14T14:48:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences arising from its use.