On the effects of similarity metrics in decentralized deep learning under distributional shift
- URL: http://arxiv.org/abs/2409.10720v1
- Date: Mon, 16 Sep 2024 20:48:16 GMT
- Title: On the effects of similarity metrics in decentralized deep learning under distributional shift
- Authors: Edvin Listo Zec, Tom Hagander, Eric Ihre-Thomason, Sarunas Girdzijauskas
- Abstract summary: Decentralized Learning (DL) enables privacy-preserving collaboration among organizations or users.
In this paper, we investigate the effectiveness of various similarity metrics in DL for identifying peers for model merging.
- Score: 2.6763602268733626
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Decentralized Learning (DL) enables privacy-preserving collaboration among organizations or users to enhance the performance of local deep learning models. However, model aggregation becomes challenging when client data is heterogeneous, and identifying compatible collaborators without direct data exchange remains a pressing issue. In this paper, we investigate the effectiveness of various similarity metrics in DL for identifying peers for model merging, conducting an empirical analysis across multiple datasets with distribution shifts. Our research provides insights into the performance of these metrics, examining their role in facilitating effective collaboration. By exploring the strengths and limitations of these metrics, we contribute to the development of robust DL methods.
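As a rough illustration of the mechanism the abstract describes, the sketch below ranks peers by cosine similarity of their flattened model parameters and merges each client's weights with its most similar peers. The choice of metric, the `top_k` cutoff, and the uniform merge weights are illustrative assumptions, not the protocol evaluated in the paper.

```python
# Minimal sketch, not the paper's protocol: similarity-guided peer selection
# and parameter-space model merging for decentralized learning.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.nn.utils import parameters_to_vector, vector_to_parameters


def cosine_similarity_matrix(models):
    """Pairwise cosine similarity between flattened model parameter vectors."""
    vecs = torch.stack([parameters_to_vector(m.parameters()).detach() for m in models])
    vecs = F.normalize(vecs, dim=1)
    return vecs @ vecs.T


def merge_with_top_k(models, sims, top_k=2):
    """Average each model's parameters with its top-k most similar peers (in place)."""
    vecs = torch.stack([parameters_to_vector(m.parameters()).detach() for m in models])
    for i, model in enumerate(models):
        scores = sims[i].clone()
        scores[i] = float("-inf")               # exclude self when ranking peers
        peers = torch.topk(scores, k=top_k).indices.tolist()
        merged = vecs[[i] + peers].mean(dim=0)  # uniform average including self
        vector_to_parameters(merged, model.parameters())


# Toy usage: three clients with identical architectures but different weights.
clients = [nn.Linear(10, 2) for _ in range(3)]
sims = cosine_similarity_matrix(clients)
merge_with_top_k(clients, sims, top_k=1)
```

In the paper's setting, the similarity metric itself is the object of study; swapping `cosine_similarity_matrix` for another metric changes only the peer ranking while the merging step stays the same.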
Related papers
- Decentralized Personalized Federated Learning [4.5836393132815045]
We focus on creating a collaboration graph that guides each client in selecting suitable collaborators for training personalized models.
Unlike traditional methods, our formulation identifies collaborators at a granular level by considering greedy relations of clients.
We achieve this through a bi-level optimization framework that employs a constrained algorithm.
arXiv Detail & Related papers (2024-06-10T17:58:48Z)
- PairCFR: Enhancing Model Training on Paired Counterfactually Augmented Data through Contrastive Learning [49.60634126342945]
Counterfactually Augmented Data (CAD) involves creating new data samples by applying minimal yet sufficient modifications to flip the label of existing data samples to other classes.
Recent research reveals that training with CAD may lead models to overly focus on modified features while ignoring other important contextual information.
We employ contrastive learning to promote global feature alignment in addition to learning counterfactual clues.
arXiv Detail & Related papers (2024-06-09T07:29:55Z)
- Distributed Continual Learning [12.18012293738896]
We introduce a mathematical framework capturing the essential aspects of distributed continual learning.
We identify three modes of information exchange: data instances, full model parameters, and modular (partial) model parameters.
Among our key findings: sharing parameters is more efficient than sharing data as tasks become more complex.
arXiv Detail & Related papers (2024-05-23T21:24:26Z)
- Causal Coordinated Concurrent Reinforcement Learning [8.654978787096807]
We propose a novel algorithmic framework for data sharing and coordinated exploration to learn more data-efficient and better-performing policies in a concurrent reinforcement learning setting.
Our algorithm leverages a causal inference method, the Additive Noise Model - Mixture Model (ANM-MM), to extract model parameters governing individual differentials via independence enforcement.
We propose a new data sharing scheme based on a similarity measure of the extracted model parameters and demonstrate superior learning speeds on a set of autoregressive, pendulum and cart-pole swing-up tasks.
arXiv Detail & Related papers (2024-01-31T17:20:28Z)
- Mitigating Shortcut Learning with Diffusion Counterfactuals and Diverse Ensembles [95.49699178874683]
We propose DiffDiv, an ensemble diversification framework exploiting Diffusion Probabilistic Models (DPMs).
We show that DPMs can generate images with novel feature combinations, even when trained on samples displaying correlated input features.
We show that DPM-guided diversification is sufficient to remove dependence on shortcut cues, without a need for additional supervised signals.
arXiv Detail & Related papers (2023-11-23T15:47:33Z)
- Leveraging Diffusion Disentangled Representations to Mitigate Shortcuts in Underspecified Visual Tasks [92.32670915472099]
We propose an ensemble diversification framework exploiting the generation of synthetic counterfactuals using Diffusion Probabilistic Models (DPMs).
We show that diffusion-guided diversification can lead models to avert attention from shortcut cues, achieving ensemble diversity performance comparable to previous methods requiring additional data collection.
arXiv Detail & Related papers (2023-10-03T17:37:52Z)
- Evaluating and Incentivizing Diverse Data Contributions in Collaborative Learning [89.21177894013225]
For a federated learning model to perform well, it is crucial to have a diverse and representative dataset.
We show that the statistical criterion used to quantify the diversity of the data, as well as the choice of the federated learning algorithm used, has a significant effect on the resulting equilibrium.
We leverage this to design simple optimal federated learning mechanisms that encourage data collectors to contribute data representative of the global population.
arXiv Detail & Related papers (2023-06-08T23:38:25Z)
- Collaborative Learning via Prediction Consensus [38.89001892487472]
We consider a collaborative learning setting where the goal of each agent is to improve their own model by leveraging the expertise of collaborators.
We propose a distillation-based method leveraging shared unlabeled auxiliary data, which is pseudo-labeled by the collective (a toy sketch of this consensus-and-distill idea follows this list).
We demonstrate empirically that our collaboration scheme is able to significantly boost the performance of individual models.
arXiv Detail & Related papers (2023-05-29T14:12:03Z)
- Improving GANs with A Dynamic Discriminator [106.54552336711997]
We argue that a discriminator with an on-the-fly adjustment on its capacity can better accommodate such a time-varying task.
A comprehensive empirical study confirms that the proposed training strategy, termed DynamicD, improves the synthesis performance without incurring any additional cost or training objectives.
arXiv Detail & Related papers (2022-09-20T17:57:33Z)
- Deep Stable Learning for Out-Of-Distribution Generalization [27.437046504902938]
Approaches based on deep neural networks have achieved striking performance when testing data and training data share a similar distribution.
Eliminating the impact of distribution shifts between training and testing data is crucial for building performance-promising deep models.
We propose to address this problem by removing the dependencies between features via learning weights for training samples.
arXiv Detail & Related papers (2021-04-16T03:54:21Z)
- Personalized Cross-Silo Federated Learning on Non-IID Data [62.68467223450439]
Non-IID data present a tough challenge for federated learning.
We propose a novel idea of pairwise collaborations between clients with similar data.
arXiv Detail & Related papers (2020-07-07T21:38:36Z)
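Related to the "Collaborative Learning via Prediction Consensus" entry above, here is a minimal sketch of distilling each client toward pseudo-labels produced by the collective on shared unlabeled data. The averaging rule, the KL loss, and the optimizer settings are illustrative assumptions rather than that paper's exact method.

```python
# Minimal sketch: consensus pseudo-labeling on shared unlabeled data,
# followed by per-client distillation toward the consensus.
import torch
import torch.nn as nn
import torch.nn.functional as F


def consensus_soft_labels(models, x_unlabeled):
    """Average the clients' softmax outputs on the shared unlabeled batch."""
    with torch.no_grad():
        probs = torch.stack([F.softmax(m(x_unlabeled), dim=1) for m in models])
    return probs.mean(dim=0)


def distill_step(model, x_unlabeled, soft_labels, lr=1e-3):
    """One KL-distillation step pulling a client's predictions toward the consensus."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    log_probs = F.log_softmax(model(x_unlabeled), dim=1)
    loss = F.kl_div(log_probs, soft_labels, reduction="batchmean")
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


# Toy usage: three clients distilling toward their prediction consensus.
clients = [nn.Linear(10, 4) for _ in range(3)]
x_shared = torch.randn(32, 10)  # stands in for shared unlabeled auxiliary data
targets = consensus_soft_labels(clients, x_shared)
losses = [distill_step(m, x_shared, targets) for m in clients]
```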
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.