Gradient flows on graphons: existence, convergence, continuity equations
- URL: http://arxiv.org/abs/2111.09459v3
- Date: Thu, 29 Jun 2023 17:11:22 GMT
- Title: Gradient flows on graphons: existence, convergence, continuity equations
- Authors: Sewoong Oh, Soumik Pal, Raghav Somani, Raghavendra Tripathi
- Abstract summary: Wasserstein gradient flows on probability measures have found a host of applications in various optimization problems.
We show that the Euclidean gradient flow of a suitable function of the edge-weights converges to a novel continuum limit given by a curve on the space of graphons.
- Score: 27.562307342062354
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Wasserstein gradient flows on probability measures have found a host of
applications in various optimization problems. They typically arise as the
continuum limit of exchangeable particle systems evolving by some mean-field
interaction involving a gradient-type potential. However, in many problems,
such as in multi-layer neural networks, the so-called particles are edge
weights on large graphs whose nodes are exchangeable. Such large graphs are
known to converge to continuum limits called graphons as their size grows to
infinity. We show that the Euclidean gradient flow of a suitable function of
the edge-weights converges to a novel continuum limit given by a curve on the
space of graphons that can be appropriately described as a gradient flow or,
more technically, a curve of maximal slope. Several natural functions on
graphons, such as homomorphism functions and the scalar entropy, are covered by
our set-up, and the examples have been worked out in detail.
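As a concrete finite-n illustration of the setup, the sketch below takes projected Euclidean gradient steps on the edge weights of a symmetric matrix, using the triangle homomorphism density as the objective. This is a minimal sketch under assumptions chosen here (the triangle density as the function, clipping of entries to [0,1], and an n^2 step scaling that offsets the 1/n^2 size of each partial derivative); it is not necessarily the paper's exact construction or normalization.

```python
import numpy as np

def triangle_density(W):
    """Homomorphism density of the triangle in a weighted graph:
    t(K3, W) = (1/n^3) * sum_{i,j,k} W_ij * W_jk * W_ki = tr(W^3) / n^3."""
    n = W.shape[0]
    return np.trace(W @ W @ W) / n**3

def triangle_density_grad(W):
    """Unconstrained Euclidean gradient of t(K3, .) at a symmetric W: 3 W^2 / n^3."""
    n = W.shape[0]
    return 3.0 * (W @ W) / n**3

def projected_gradient_step(W, step):
    """One Euclidean gradient step on the edge weights, followed by clipping to
    [0, 1] and re-symmetrization so W remains a valid kernel."""
    W = W - step * triangle_density_grad(W)
    W = np.clip(W, 0.0, 1.0)
    return 0.5 * (W + W.T)

rng = np.random.default_rng(0)
n = 200
A = rng.uniform(size=(n, n))
W = 0.5 * (A + A.T)                      # random symmetric edge weights in [0, 1]
np.fill_diagonal(W, 0.0)

print("triangle density before:", triangle_density(W))
for _ in range(200):
    # The n**2 factor is illustrative: each partial derivative of a
    # homomorphism density is O(1/n^2), so an O(n^2) rate keeps entries moving.
    W = projected_gradient_step(W, step=1e-3 * n**2)
print("triangle density after: ", triangle_density(W))
```

In the paper, the n → ∞ limit of flows of this kind is described as a curve of maximal slope on the space of graphons; the same finite-n picture also underlies the "Stochastic optimization on matrices" entry listed below.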
Related papers
- Adversarial flows: A gradient flow characterization of adversarial attacks [1.8749305679160366]
A popular method to perform adversarial attacks on neural networks is the so-called fast gradient sign method (a toy sketch of the FGSM update follows the related-papers list).
We show convergence of the discretization to the associated gradient flow.
arXiv Detail & Related papers (2024-06-08T07:05:26Z) - On the Convergence of Gradient Descent for Large Learning Rates [55.33626480243135]
We show that convergence is impossible when a fixed step size is used.
We provide a proof of this in the case of linear neural networks with a squared loss.
We also prove the impossibility of convergence for more general losses without requiring strong assumptions such as Lipschitz continuity for the gradient (a toy illustration of the classical fixed-step threshold follows the related-papers list).
arXiv Detail & Related papers (2024-02-20T16:01:42Z) - Curve Your Attention: Mixed-Curvature Transformers for Graph Representation Learning [77.1421343649344]
We propose a generalization of Transformers towards operating entirely on the product of constant curvature spaces.
We also provide a kernelized approach to non-Euclidean attention, which enables our model to run in time and memory cost linear in the number of nodes and edges (a generic linear-attention sketch follows the related-papers list).
arXiv Detail & Related papers (2023-09-08T02:44:37Z) - Path convergence of Markov chains on large graphs [3.693375843298262]
We show that as the size of a graph goes to infinity, the random trajectories of the processes converge to deterministic curves on the space of measure-valued graphons.
A novel feature of this approach is that it provides a precise exponential convergence rate for the Metropolis chain in a certain limiting regime.
arXiv Detail & Related papers (2023-08-18T00:13:59Z) - Geometric Graph Filters and Neural Networks: Limit Properties and Discriminability Trade-offs [122.06927400759021]
We study the relationship between a graph neural network (GNN) and a manifold neural network (MNN) when the graph is constructed from a set of points sampled from the manifold.
We prove non-asymptotic error bounds showing that convolutional filters and neural networks on these graphs converge to convolutional filters and neural networks on the continuous manifold.
arXiv Detail & Related papers (2023-05-29T08:27:17Z) - Graph Generation with Diffusion Mixture [57.78958552860948]
Generation of graphs is a major challenge for real-world tasks that require understanding the complex nature of their non-Euclidean structures.
We propose a generative framework that models the topology of graphs by explicitly learning the final graph structures of the diffusion process.
arXiv Detail & Related papers (2023-02-07T17:07:46Z) - Stochastic optimization on matrices and a graphon McKean-Vlasov limit [26.906770707395832]
We consider gradient descents on the space of large symmetric matrices of suitable functions that are invariant under permuting the rows and columns using the same permutation.
We establish deterministic limits of these random curves as the dimensions of the matrices go to infinity while the entries remain bounded.
arXiv Detail & Related papers (2022-10-02T04:54:49Z) - Understanding convolution on graphs via energies [23.18124653469668]
Graph neural networks (GNNs) typically operate by message-passing, where the state of a node is updated based on the information received from its neighbours.
Most message-passing models act as graph convolutions, where features are mixed by a shared, linear transformation before being propagated over the edges (a one-layer sketch follows the related-papers list).
On node-classification tasks, graph convolutions have been shown to suffer from two limitations: poor performance on heterophilic graphs, and over-smoothing.
arXiv Detail & Related papers (2022-06-22T11:45:36Z) - Uniform Convergence Rates for Lipschitz Learning on Graphs [1.9014535120129339]
Lipschitz learning is a graph-based semi-supervised learning method.
We prove uniform convergence rates for solutions of the graph infinity Laplace equation (written out after the related-papers list).
arXiv Detail & Related papers (2021-11-24T09:44:14Z) - Multilayer Graph Clustering with Optimized Node Embedding [70.1053472751897]
Multilayer graph clustering aims at dividing the graph nodes into categories or communities.
We propose a clustering-friendly embedding of the layers of a given multilayer graph.
Experiments show that our method leads to a significant improvement.
arXiv Detail & Related papers (2021-03-30T17:36:40Z) - Multipole Graph Neural Operator for Parametric Partial Differential Equations [57.90284928158383]
One of the main challenges in using deep learning-based methods for simulating physical systems is formulating physics-based data.
We propose a novel multi-level graph neural network framework that captures interaction at all ranges with only linear complexity.
Experiments confirm our multi-graph network learns discretization-invariant solution operators to PDEs and can be evaluated in linear time.
arXiv Detail & Related papers (2020-06-16T21:56:22Z)
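To make the fast gradient sign method named in the "Adversarial flows" entry concrete, here is a toy NumPy sketch of the FGSM update applied repeatedly to the input of a logistic model. The model, step size, and iteration count are illustrative choices only; the cited paper's contribution is the gradient-flow description of such discretizations, which this toy does not reproduce.

```python
import numpy as np

def fgsm_step(x, grad_x, eps):
    """One fast-gradient-sign step: move the input by eps in the direction of
    the sign of the loss gradient with respect to the input."""
    return x + eps * np.sign(grad_x)

def loss_and_grad_x(w, b, x, y):
    """Cross-entropy loss of a logistic model p(y=1|x) = sigmoid(w.x + b),
    together with its gradient with respect to the input x."""
    p = 1.0 / (1.0 + np.exp(-(w @ x + b)))
    loss = -(y * np.log(p) + (1 - y) * np.log(1 - p))
    return loss, (p - y) * w              # d loss / d x

rng = np.random.default_rng(0)
w, b = rng.normal(size=5), 0.1
x, y = rng.normal(size=5), 1.0

# Iterated small FGSM steps: the related paper studies the continuum
# (gradient-flow) limit of such discretizations; this loop is only a toy.
for _ in range(50):
    _, g = loss_and_grad_x(w, b, x, y)
    x = fgsm_step(x, g, eps=0.01)
print("loss after attack:", loss_and_grad_x(w, b, x, y)[0])
```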
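For the "On the Convergence of Gradient Descent for Large Learning Rates" entry, the one-dimensional quadratic already shows how a fixed step size can preclude convergence: with a gradient that is Lipschitz with constant L, the iterates contract only when the step size is below 2/L. The snippet illustrates that classical threshold; the paper's impossibility results cover far more general models and losses than this toy.

```python
# Gradient descent on f(x) = 0.5 * L * x**2, whose gradient L * x is
# Lipschitz with constant L.  The update is x_{k+1} = (1 - eta * L) * x_k,
# so the iterates shrink only when |1 - eta * L| < 1, i.e. eta < 2 / L.
L = 10.0
x0 = 1.0
for eta in (0.15, 0.25):          # 2 / L = 0.2 separates the two regimes
    x = x0
    for _ in range(50):
        x = x - eta * L * x       # one gradient step
    print(f"eta = {eta}: |x_50| = {abs(x):.3e}")
# eta = 0.15 contracts toward 0; eta = 0.25 blows up geometrically
```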
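The "Curve Your Attention" entry mentions a kernelized attention whose time and memory cost is linear in the number of nodes and edges. The sketch below is a generic Euclidean linear-attention construction showing why a kernel factorization brings the cost down from quadratic to linear; the feature map and the flat-space setting are assumptions made here, not the paper's mixed-curvature construction.

```python
import numpy as np

def elu_feature_map(x):
    """A common positive feature map used in linear attention: elu(x) + 1."""
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(Q, K, V):
    """Kernelized attention: softmax(Q K^T) V is replaced by
    phi(Q) (phi(K)^T V) / (phi(Q) phi(K)^T 1).
    The (d x d_v) summary phi(K)^T V is built once, so the cost grows linearly
    with the number of tokens or nodes instead of quadratically."""
    Qf, Kf = elu_feature_map(Q), elu_feature_map(K)
    kv = Kf.T @ V                        # (d, d_v) key-value summary
    normalizer = Qf @ Kf.sum(axis=0)     # (n,) denominator for each query
    return (Qf @ kv) / normalizer[:, None]

rng = np.random.default_rng(0)
n, d = 1000, 16
Q, K, V = (rng.normal(size=(n, d)) for _ in range(3))
print(linear_attention(Q, K, V).shape)   # (1000, 16)
```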
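The "Understanding convolution on graphs via energies" entry describes graph convolutions in which node features are mixed by a shared linear transformation and then propagated over the edges. Below is a minimal one-layer version of that pattern; the symmetric normalization, self-loops, and ReLU are common conventions assumed for the sketch, not details taken from the paper.

```python
import numpy as np

def graph_convolution(A, H, W):
    """One graph-convolution layer: mix node features with a shared linear map
    W, then propagate them over the edges with a symmetrically normalized
    adjacency, H' = relu(D^{-1/2} (A + I) D^{-1/2} H W)."""
    A_hat = A + np.eye(A.shape[0])               # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    prop = (A_hat * d_inv_sqrt[:, None]) * d_inv_sqrt[None, :]
    return np.maximum(prop @ (H @ W), 0.0)       # ReLU

rng = np.random.default_rng(0)
n, d_in, d_out = 6, 4, 3
A = np.triu(rng.integers(0, 2, size=(n, n)), 1)
A = (A + A.T).astype(float)                      # random undirected graph
H = rng.normal(size=(n, d_in))                   # node features
W = rng.normal(size=(d_in, d_out))               # shared linear transformation
print(graph_convolution(A, H, W).shape)          # (6, 3)
```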
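For the "Uniform Convergence Rates for Lipschitz Learning on Graphs" entry, one common way to write the graph infinity Laplace equation (the exact weights and scaling used in the paper may differ) is

\[
\Delta^G_{\infty} u(x_i) \;=\; \max_{j} w_{ij}\bigl(u(x_j)-u(x_i)\bigr) \;+\; \min_{j} w_{ij}\bigl(u(x_j)-u(x_i)\bigr),
\]

and Lipschitz learning extends labels \(g\) from a labeled set \(\mathcal{O}\) by solving

\[
\Delta^G_{\infty} u(x_i) = 0 \ \text{ for } x_i \notin \mathcal{O}, \qquad u(x_i) = g(x_i) \ \text{ for } x_i \in \mathcal{O}.
\]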
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.