Beyond spectral gap: The role of the topology in decentralized learning
- URL: http://arxiv.org/abs/2206.03093v1
- Date: Tue, 7 Jun 2022 08:19:06 GMT
- Title: Beyond spectral gap: The role of the topology in decentralized learning
- Authors: Thijs Vogels and Hadrien Hendrikx and Martin Jaggi
- Abstract summary: In data-parallel optimization of machine learning models, workers collaborate to improve their estimates of the model.
This paper aims to paint an accurate picture of sparsely-connected distributed optimization when workers share the same data distribution.
Our theory matches empirical observations in deep learning, and accurately describes the relative merits of different graph topologies.
- Score: 58.48291921602417
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In data-parallel optimization of machine learning models, workers collaborate
to improve their estimates of the model: more accurate gradients allow them to
use larger learning rates and optimize faster. We consider the setting in which
all workers sample from the same dataset, and communicate over a sparse graph
(decentralized). In this setting, current theory fails to capture important
aspects of real-world behavior. First, the 'spectral gap' of the communication
graph is not predictive of its empirical performance in (deep) learning.
Second, current theory does not explain that collaboration enables larger
learning rates than training alone. In fact, it prescribes smaller learning
rates, which further decrease as graphs become larger, failing to explain
convergence in infinite graphs. This paper aims to paint an accurate picture of
sparsely-connected distributed optimization when workers share the same data
distribution. We quantify how the graph topology influences convergence in a
quadratic toy problem and provide theoretical results for general smooth and
(strongly) convex objectives. Our theory matches empirical observations in deep
learning, and accurately describes the relative merits of different graph
topologies.
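To make the abstract's two observations concrete, here is a minimal sketch (not the authors' code) that computes the spectral gap of a ring gossip matrix and runs decentralized SGD on a quadratic toy problem in which every worker samples from the same distribution. The ring topology, the uniform 1/3 gossip weights, the step size, the noise level, and the function names (ring_gossip_matrix, spectral_gap, dsgd_steady_state_error) are all illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch (not the authors' code): spectral gap of a ring gossip
# matrix vs. the steady-state error of decentralized SGD on a quadratic
# toy problem where all workers share the same data distribution.
# Topology, gossip weights, step size and noise level are assumptions.
import numpy as np

def ring_gossip_matrix(n):
    """Symmetric, doubly stochastic gossip matrix for a ring:
    each worker averages uniformly with itself and its two neighbours."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, i] = W[i, (i - 1) % n] = W[i, (i + 1) % n] = 1 / 3
    return W

def spectral_gap(W):
    """1 minus the second-largest eigenvalue modulus of W."""
    moduli = np.sort(np.abs(np.linalg.eigvalsh(W)))[::-1]
    return 1.0 - moduli[1]

def dsgd_steady_state_error(W, steps=2000, lr=0.1, noise=1.0, seed=0):
    """D-SGD on f(x) = x^2 / 2 per worker with i.i.d. gradient noise
    (the same distribution on every worker); returns the mean squared
    distance to the optimum over the second half of the run."""
    rng = np.random.default_rng(seed)
    n = W.shape[0]
    x = np.zeros(n)
    errors = []
    for _ in range(steps):
        grad = x + noise * rng.standard_normal(n)  # stochastic gradient
        x = W @ (x - lr * grad)                    # local step, then gossip
        errors.append(np.mean(x ** 2))
    return np.mean(errors[steps // 2:])

# A lone worker as the no-collaboration baseline.
print(f"solo n=   1  error={dsgd_steady_state_error(np.eye(1)):.4f}")
for n in (8, 32, 128):
    W = ring_gossip_matrix(n)
    print(f"ring n={n:4d}  spectral gap={spectral_gap(W):.4f}  "
          f"error={dsgd_steady_state_error(W):.4f}")
```

In this toy run the ring's spectral gap shrinks toward zero as n grows, yet the steady-state error stays roughly flat and well below the lone worker's, which mirrors the abstract's claims: the spectral gap alone does not predict performance, and collaboration reduces gradient noise enough to sustain larger learning rates than training alone.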
Related papers
- Understanding Community Bias Amplification in Graph Representation
Learning [22.522798932536038]
We study a phenomenon of community bias amplification in graph representation learning.
We propose a novel graph contrastive learning model called Random Graph Coarsening Contrastive Learning.
arXiv Detail & Related papers (2023-12-08T07:43:05Z)
- Beyond spectral gap (extended): The role of the topology in decentralized learning [58.48291921602417]
In data-parallel optimization of machine learning models, workers collaborate to improve their estimates of the model.
Current theory does not explain that collaboration enables larger learning rates than training alone.
This paper aims to paint an accurate picture of sparsely-connected distributed optimization.
arXiv Detail & Related papers (2023-01-05T16:53:38Z)
- Optimal Propagation for Graph Neural Networks [51.08426265813481]
We propose a bi-level optimization approach for learning the optimal graph structure.
We also explore a low-rank approximation model for further reducing the time complexity.
arXiv Detail & Related papers (2022-05-06T03:37:00Z)
- Graph Self-supervised Learning with Accurate Discrepancy Learning [64.69095775258164]
We propose a framework that aims to learn the exact discrepancy between the original and the perturbed graphs, coined Discrepancy-based Self-supervised LeArning (D-SLA).
We validate our method on various graph-related downstream tasks, including molecular property prediction, protein function prediction, and link prediction tasks, on which our model largely outperforms relevant baselines.
arXiv Detail & Related papers (2022-02-07T08:04:59Z)
- Deconfounded Training for Graph Neural Networks [98.06386851685645]
We present a new paradigm of deconfounded training (DTP) that better mitigates the confounding effect and latches onto the critical information.
Specifically, we adopt the attention modules to disentangle the critical subgraph and trivial subgraph.
It allows GNNs to capture a more reliable subgraph whose relation with the label is robust across different distributions.
arXiv Detail & Related papers (2021-12-30T15:22:35Z)
- Learning Graphs from Smooth Signals under Moment Uncertainty [23.868075779606425]
We consider the problem of inferring the graph structure from a given set of graph signals.
Traditional graph learning models do not take the distributional uncertainty of the observed signals into account.
arXiv Detail & Related papers (2021-05-12T06:47:34Z)
- Multilayer Clustered Graph Learning [66.94201299553336]
We use contrastive loss as a data fidelity term, in order to properly aggregate the observed layers into a representative graph.
Experiments show that our method yields clusters that are consistent with the ground truth, and that the learned representative graph is well suited to solving clustering problems.
arXiv Detail & Related papers (2020-10-29T09:58:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.