Data-heterogeneity-aware Mixing for Decentralized Learning
- URL: http://arxiv.org/abs/2204.06477v1
- Date: Wed, 13 Apr 2022 15:54:35 GMT
- Title: Data-heterogeneity-aware Mixing for Decentralized Learning
- Authors: Yatin Dandi, Anastasia Koloskova, Martin Jaggi, Sebastian U. Stich
- Abstract summary: We characterize the dependence of convergence on the relationship between the mixing weights of the graph and the data heterogeneity across nodes.
We propose a metric that quantifies the ability of a graph to mix the current gradients.
Motivated by our analysis, we propose an approach that periodically and efficiently optimizes the metric.
- Score: 63.83913592085953
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Decentralized learning provides an effective framework to train machine
learning models with data distributed over arbitrary communication graphs.
However, most existing approaches toward decentralized learning disregard the
interaction between data heterogeneity and graph topology. In this paper, we
characterize the dependence of convergence on the relationship between the
mixing weights of the graph and the data heterogeneity across nodes. We propose
a metric that quantifies the ability of a graph to mix the current gradients.
We further prove that the metric controls the convergence rate, particularly in
settings where the heterogeneity across nodes dominates the stochasticity
between updates for a given node. Motivated by our analysis, we propose an
approach that periodically and efficiently optimizes the metric using standard
convex constrained optimization and sketching techniques. Through comprehensive
experiments on standard computer vision and NLP benchmarks, we show that our
approach leads to improvement in test performance for a wide range of tasks.
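To make the idea concrete, below is a minimal sketch of the kind of heterogeneity-aware mixing-weight optimization the abstract describes: given the nodes' current gradients, choose mixing weights supported on the communication graph so that one gossip step brings the gradients as close as possible to their exact average. This is an illustrative reading of the abstract, not the authors' released code; the function name `optimize_mixing` and the use of `cvxpy` are assumptions. Per the abstract, the paper amortizes the cost by re-optimizing only periodically and by sketching the gradients, which the dense version below omits for clarity.

```python
# Illustrative sketch (not the authors' implementation): choose mixing
# weights W supported on the communication graph so that one gossip step
# W @ grads lands as close as possible to exact averaging of the gradients.
import cvxpy as cp
import numpy as np

def optimize_mixing(adjacency: np.ndarray, grads: np.ndarray) -> np.ndarray:
    """adjacency: (n, n) 0/1 matrix with self-loops; grads: (n, d) stacked node gradients."""
    n = adjacency.shape[0]
    exact_avg = np.ones((n, n)) / n  # operator that averages all nodes exactly
    W = cp.Variable((n, n), symmetric=True)
    constraints = [
        cp.sum(W, axis=1) == 1,                # each row's weights sum to one
        cp.multiply(1.0 - adjacency, W) == 0,  # zero weight on missing edges
    ]
    # The metric: residual gradient heterogeneity left after one mixing step.
    objective = cp.Minimize(cp.norm((W - exact_avg) @ grads, "fro"))
    cp.Problem(objective, constraints).solve()
    return W.value
```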
Related papers
- NTK-DFL: Enhancing Decentralized Federated Learning in Heterogeneous Settings via Neural Tangent Kernel [27.92271597111756]
Decentralized federated learning (DFL) is a collaborative machine learning framework for training a model across participants without a central server or raw data exchange.
Recent work has shown that the neural tangent kernel (NTK) approach, when applied to federated learning in a centralized framework, can lead to improved performance.
We propose an approach leveraging the NTK to train client models in the decentralized setting, while introducing a synergy between NTK-based evolution and model averaging.
arXiv Detail & Related papers (2024-10-02T18:19:28Z)
- Modularity aided consistent attributed graph clustering via coarsening [6.522020196906943]
Graph clustering is an important unsupervised learning technique for partitioning graphs with attributes and detecting communities.
We propose a loss function incorporating log-determinant, smoothness, and modularity components using a block majorization-minimization technique (a generic modularity term is sketched after this entry).
Our algorithm seamlessly integrates graph neural networks (GNNs) and variational graph autoencoders (VGAEs) to learn enhanced node features and deliver exceptional clustering performance.
arXiv Detail & Related papers (2024-07-09T10:42:19Z)
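For the clustering entry above, a generic soft-modularity term can be computed as follows. This is a standard formulation and only one component of that paper's loss, which also includes log-determinant and smoothness terms optimized by block majorization-minimization; the function name `soft_modularity` is an illustrative assumption.

```python
# Generic soft-modularity term: Q = trace(C^T B C) / 2m, with modularity
# matrix B = A - d d^T / 2m. Standard formulation, not that paper's full loss.
import numpy as np

def soft_modularity(A: np.ndarray, C: np.ndarray) -> float:
    """A: (n, n) symmetric adjacency; C: (n, k) soft cluster assignments (rows sum to 1)."""
    d = A.sum(axis=1)               # node degrees
    two_m = d.sum()                 # twice the number of (weighted) edges
    B = A - np.outer(d, d) / two_m  # modularity matrix
    return float(np.trace(C.T @ B @ C) / two_m)
```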
- Graph Out-of-Distribution Generalization with Controllable Data Augmentation [51.17476258673232]
Graph Neural Networks (GNNs) have demonstrated extraordinary performance in classifying graph properties.
Due to the selection bias of training and testing data, distribution deviation is widespread.
We propose OOD calibration to measure the distribution deviation of virtual samples.
arXiv Detail & Related papers (2023-08-16T13:10:27Z)
- Towards Relation-centered Pooling and Convolution for Heterogeneous Graph Learning Networks [11.421162988355146]
Heterogeneous graph neural networks have unleashed great potential in graph representation learning.
We design a relation-centered Pooling and Convolution for Heterogeneous Graph learning Network, namely PC-HGN, to enable relation-specific sampling and cross-relation convolutions.
We evaluate the performance of the proposed model by comparing with state-of-the-art graph learning models on three different real-world datasets.
arXiv Detail & Related papers (2022-10-31T08:43:32Z)
- Interpolation-based Correlation Reduction Network for Semi-Supervised Graph Learning [49.94816548023729]
We propose a novel graph contrastive learning method, termed Interpolation-based Correlation Reduction Network (ICRN).
In our method, we improve the discriminative capability of the latent feature by enlarging the margin of decision boundaries.
By combining the two settings, we extract rich supervision information from both the abundant unlabeled nodes and the rare yet valuable labeled nodes for discriminative representation learning.
arXiv Detail & Related papers (2022-06-06T14:26:34Z)
- Deep Graph Clustering via Mutual Information Maximization and Mixture Model [6.488575826304023]
We introduce a contrastive learning framework for learning clustering-friendly node embeddings.
Experiments on real-world datasets demonstrate the effectiveness of our method in community detection.
arXiv Detail & Related papers (2022-05-10T21:03:55Z)
- Optimal Propagation for Graph Neural Networks [51.08426265813481]
We propose a bi-level optimization approach for learning the optimal graph structure.
We also explore a low-rank approximation model for further reducing the time complexity.
arXiv Detail & Related papers (2022-05-06T03:37:00Z)
- Score-based Generative Modeling of Graphs via the System of Stochastic Differential Equations [57.15855198512551]
We propose a novel score-based generative model for graphs with a continuous-time framework.
We show that our method is able to generate molecules that lie close to the training distribution yet do not violate the chemical valency rule.
arXiv Detail & Related papers (2022-02-05T08:21:04Z)
- Bayesian Graph Contrastive Learning [55.36652660268726]
We propose a novel perspective on graph contrastive learning methods, showing that random augmentations lead to stochastic encoders.
Our proposed method represents each node by a distribution in the latent space, in contrast to existing techniques which embed each node as a deterministic vector (see the sketch after this entry).
We show a considerable improvement in performance compared to existing state-of-the-art methods on several benchmark datasets.
arXiv Detail & Related papers (2021-12-15T01:45:32Z)
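For the Bayesian Graph Contrastive Learning entry above, here is a minimal sketch of a distributional node embedding, assuming a diagonal Gaussian per node with the reparameterization trick; the class name and architecture are illustrative assumptions, not the paper's exact model.

```python
# Minimal sketch of a distributional node embedding (diagonal Gaussian per
# node, reparameterization trick); illustrative, not the paper's exact model.
import torch

class GaussianNodeEmbedding(torch.nn.Module):
    def __init__(self, in_dim: int, latent_dim: int):
        super().__init__()
        self.mu = torch.nn.Linear(in_dim, latent_dim)      # predicts the mean
        self.logvar = torch.nn.Linear(in_dim, latent_dim)  # predicts log-variance

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        mu, logvar = self.mu(h), self.logvar(h)
        eps = torch.randn_like(mu)                 # fresh noise per forward pass
        return mu + eps * torch.exp(0.5 * logvar)  # sample from N(mu, sigma^2)
```

Each forward pass draws a fresh sample, so repeated encodings of the same node differ, which is what lets a contrastive objective account for embedding uncertainty.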
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.