Beyond spectral gap (extended): The role of the topology in decentralized learning
- URL: http://arxiv.org/abs/2301.02151v1
- Date: Thu, 5 Jan 2023 16:53:38 GMT
- Title: Beyond spectral gap (extended): The role of the topology in decentralized learning
- Authors: Thijs Vogels, Hadrien Hendrikx, Martin Jaggi
- Abstract summary: In data-parallel optimization of machine learning models, workers collaborate to improve their estimates of the model.
Current theory does not explain why collaboration enables larger learning rates than training alone.
This paper aims to paint an accurate picture of sparsely-connected distributed optimization.
- Score: 58.48291921602417
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In data-parallel optimization of machine learning models, workers collaborate
to improve their estimates of the model: more accurate gradients allow them to
use larger learning rates and optimize faster. In the decentralized setting, in
which workers communicate over a sparse graph, current theory fails to capture
important aspects of real-world behavior. First, the 'spectral gap' of the
communication graph is not predictive of its empirical performance in (deep)
learning. Second, current theory does not explain why collaboration enables
larger learning rates than training alone. In fact, it prescribes smaller
learning rates, which further decrease as graphs become larger, failing to
explain convergence dynamics in infinite graphs. This paper aims to paint an
accurate picture of sparsely-connected distributed optimization. We quantify
how the graph topology influences convergence in a quadratic toy problem and
provide theoretical results for general smooth and (strongly) convex
objectives. Our theory matches empirical observations in deep learning, and
accurately describes the relative merits of different graph topologies. This
paper is an extension of the conference paper by Vogels et al. (2022). Code:
https://github.com/epfml/topology-in-decentralized-learning.
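For context on the quantity the abstract critiques: for a symmetric, doubly stochastic gossip matrix W, the spectral gap is 1 - |lambda_2(W)|, where lambda_2 is the second-largest eigenvalue in absolute value. A minimal NumPy sketch of computing it for two standard topologies (the ring and fully-connected choices here are illustrative; this is not code from the paper's repository):

```python
import numpy as np

def ring_gossip_matrix(n: int) -> np.ndarray:
    """Gossip matrix for a ring: each worker averages uniformly
    with itself and its two neighbours."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, i] = 1.0 / 3.0
        W[i, (i - 1) % n] = 1.0 / 3.0
        W[i, (i + 1) % n] = 1.0 / 3.0
    return W

def fully_connected_gossip_matrix(n: int) -> np.ndarray:
    """Every worker averages with all others: one gossip step
    reaches exact consensus."""
    return np.full((n, n), 1.0 / n)

def spectral_gap(W: np.ndarray) -> float:
    """1 - |lambda_2|, with eigenvalues sorted by absolute value.
    For a symmetric doubly stochastic W, the top eigenvalue is 1."""
    eigvals = np.sort(np.abs(np.linalg.eigvalsh(W)))[::-1]
    return 1.0 - eigvals[1]

for n in (8, 64, 512):
    print(f"n={n:4d}  ring gap={spectral_gap(ring_gossip_matrix(n)):.5f}  "
          f"fully-connected gap={spectral_gap(fully_connected_gossip_matrix(n)):.5f}")
```

On the ring, the gap decays on the order of 1/n^2, which is why learning-rate prescriptions based on the spectral gap collapse on large sparse graphs, precisely the failure mode the abstract describes.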
Related papers
- Beyond spectral gap: The role of the topology in decentralized learning [58.48291921602417]
In data-parallel optimization of machine learning models, workers collaborate to improve their estimates of the model.
This paper aims to paint an accurate picture of sparsely-connected distributed optimization when workers share the same data distribution.
Our theory matches empirical observations in deep learning, and accurately describes the relative merits of different graph topologies.
arXiv Detail & Related papers (2022-06-07T08:19:06Z)
- Optimal Propagation for Graph Neural Networks [51.08426265813481]
We propose a bi-level optimization approach for learning the optimal graph structure.
We also explore a low-rank approximation model for further reducing the time complexity.
arXiv Detail & Related papers (2022-05-06T03:37:00Z)
- Graph Self-supervised Learning with Accurate Discrepancy Learning [64.69095775258164]
We propose Discrepancy-based Self-supervised LeArning (D-SLA), a framework that aims to learn the exact discrepancy between the original and the perturbed graphs.
We validate our method on various graph-related downstream tasks, including molecular property prediction, protein function prediction, and link prediction tasks, on which our model largely outperforms relevant baselines.
arXiv Detail & Related papers (2022-02-07T08:04:59Z)
- Learning Representations of Entities and Relations [0.0]
This thesis focuses on improving knowledge graph representation with the aim of tackling the link prediction task.
The first contribution is HypER, a convolutional model that simplifies earlier convolutional architectures while improving their link prediction performance.
The second contribution is TuckER, a relatively straightforward linear model, which, at the time of its introduction, obtained state-of-the-art link prediction performance.
The third contribution is MuRP, the first multi-relational graph representation model embedded in hyperbolic space.
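As a concrete reference for the TuckER entry above: TuckER scores a triple (subject, relation, object) with a Tucker decomposition of the binary relation tensor, score(s, r, o) = W x_1 e_s x_2 w_r x_3 e_o, passed through a logistic sigmoid. A minimal sketch with random parameters (the dimensions are illustrative assumptions, not values from the thesis):

```python
import numpy as np

rng = np.random.default_rng(0)
n_entities, n_relations = 100, 10
d_e, d_r = 20, 15                        # entity / relation embedding sizes

E = rng.normal(size=(n_entities, d_e))   # entity embeddings
R = rng.normal(size=(n_relations, d_r))  # relation embeddings
W = rng.normal(size=(d_e, d_r, d_e))     # shared core tensor

def tucker_score(s: int, r: int, o: int) -> float:
    """score(s, r, o) = W x1 e_s x2 w_r x3 e_o (n-mode products)."""
    return float(np.einsum('ijk,i,j,k->', W, E[s], R[r], E[o]))

# Predicted probability that the triple (subject=3, relation=1, object=7) holds:
p = 1.0 / (1.0 + np.exp(-tucker_score(3, 1, 7)))
print(p)
```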
arXiv Detail & Related papers (2022-01-31T09:24:43Z)
- Generating the Graph Gestalt: Kernel-Regularized Graph Representation Learning [47.506013386710954]
A complete scientific understanding of graph data should address both global and local structure.
We propose a joint model for both as complementary objectives in a graph VAE framework.
Our experiments demonstrate a significant improvement in the realism of the generated graph structures, typically by 1-2 orders of magnitude on graph structure metrics.
arXiv Detail & Related papers (2021-06-29T10:48:28Z)
- Multilayer Clustered Graph Learning [66.94201299553336]
We use a contrastive loss as a data fidelity term in order to properly aggregate the observed layers into a representative graph.
Experiments show that our method leads to well-clustered representative graphs, which are effective for solving clustering problems.
arXiv Detail & Related papers (2020-10-29T09:58:02Z)
- Towards Deeper Graph Neural Networks [63.46470695525957]
Graph convolutions perform neighborhood aggregation and represent one of the most important graph operations.
Stacking many graph convolution layers, however, tends to degrade performance; several recent studies attribute this deterioration to the over-smoothing issue.
We propose Deep Adaptive Graph Neural Network (DAGNN) to adaptively incorporate information from large receptive fields.
arXiv Detail & Related papers (2020-07-18T01:11:14Z)
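To make the DAGNN entry above concrete: the model decouples feature transformation from propagation, propagates the transformed features over k hops with a normalized adjacency matrix, and learns per-node retention scores that adaptively weight each hop. A rough NumPy sketch of that forward pass (untrained random weights; the sizes and single-layer transformation are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, c, k = 6, 8, 4, 3              # nodes, features, classes, hops

# Random undirected graph with self-loops.
A = (rng.random((n, n)) < 0.4).astype(float)
A = np.maximum(A, A.T)
np.fill_diagonal(A, 0.0)
A_hat = A + np.eye(n)
d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
P = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]  # sym. normalized adjacency

X = rng.normal(size=(n, d))
W1 = rng.normal(size=(d, c))         # feature transformation (an MLP in the paper)
s = rng.normal(size=(c, 1))          # retention-score projection

Z = np.maximum(X @ W1, 0.0)          # transform once, before any propagation
hops = [Z]
for _ in range(k):                   # parameter-free propagation
    hops.append(P @ hops[-1])
H = np.stack(hops, axis=1)           # shape (n, k+1, c)

scores = 1.0 / (1.0 + np.exp(-(H @ s)))  # per-node, per-hop retention scores
out = (scores * H).sum(axis=1)           # adaptively weighted sum over hops
print(out.shape)                         # (n, c) class logits
```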