Robust and Efficient Aggregation for Distributed Learning
- URL: http://arxiv.org/abs/2204.00586v1
- Date: Fri, 1 Apr 2022 17:17:41 GMT
- Title: Robust and Efficient Aggregation for Distributed Learning
- Authors: Stefan Vlaski, Christian Schroth, Michael Muma, Abdelhak M. Zoubir
- Abstract summary: Distributed learning schemes based on averaging are known to be susceptible to outliers.
A single malicious agent is able to drive an averaging-based distributed learning algorithm to an arbitrarily poor model.
This has motivated the development of robust aggregation schemes, which are based on variations of the median and trimmed mean.
- Score: 37.203175053625245
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Distributed learning paradigms, such as federated and decentralized learning,
allow for the coordination of models across a collection of agents, and without
the need to exchange raw data. Instead, agents compute model updates locally
based on their available data, and subsequently share the updated model with a
parameter server or their peers. This is followed by an aggregation step, which
traditionally takes the form of a (weighted) average. Distributed learning
schemes based on averaging are known to be susceptible to outliers. A single
malicious agent is able to drive an averaging-based distributed learning
algorithm to an arbitrarily poor model. This has motivated the development of
robust aggregation schemes, which are based on variations of the median and
trimmed mean. While such procedures ensure robustness to outliers and malicious
behavior, they come at the cost of significantly reduced sample efficiency.
This means that current robust aggregation schemes require significantly higher
agent participation rates to achieve a given level of performance than their
mean-based counterparts in non-contaminated settings. In this work we remedy
this drawback by developing statistically efficient and robust aggregation
schemes for distributed learning.
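To make the contrast concrete, the following minimal sketch (assuming NumPy; this code is illustrative and not from the paper) compares plain averaging with two standard robust aggregators, the coordinate-wise median and the trimmed mean, under a single malicious agent:

```python
import numpy as np

def mean_aggregate(updates):
    # Plain (unweighted) average: vulnerable to a single outlier.
    return np.mean(updates, axis=0)

def median_aggregate(updates):
    # Coordinate-wise median: robust, but less sample-efficient.
    return np.median(updates, axis=0)

def trimmed_mean_aggregate(updates, trim_ratio=0.1):
    # Coordinate-wise trimmed mean: drop the largest and smallest
    # trim_ratio fraction of values in each coordinate, then average.
    sorted_updates = np.sort(updates, axis=0)
    k = int(len(updates) * trim_ratio)
    return np.mean(sorted_updates[k:len(updates) - k], axis=0)

rng = np.random.default_rng(0)
honest = rng.normal(loc=1.0, scale=0.1, size=(9, 4))   # 9 honest agents
malicious = np.full((1, 4), 1e6)                       # 1 malicious agent
updates = np.vstack([honest, malicious])

print(mean_aggregate(updates))          # dragged toward 1e6
print(median_aggregate(updates))        # stays near 1.0
print(trimmed_mean_aggregate(updates))  # stays near 1.0
```

The single corrupted update drags the average arbitrarily far, while the median and trimmed mean stay near the honest agents' value; the price of that robustness is the reduced sample efficiency the abstract describes.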
Related papers
- WASH: Train your Ensemble with Communication-Efficient Weight Shuffling, then Average [21.029085451757368]
Weight averaging methods aim at balancing the generalization of ensembling and the inference speed of a single model.
We introduce WASH, a novel distributed method for training model ensembles for weight averaging that achieves state-of-the-art image classification accuracy.
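As a rough illustration of the idea only (the actual WASH shuffling rule is specified in the paper; this sketch and its parameter p are assumptions), one can periodically permute a random subset of corresponding weights across ensemble members during training and average the weights at the end:

```python
import numpy as np

def shuffle_weights(models, rng, p=0.1):
    # Hypothetical sketch: for a random fraction p of coordinates,
    # permute the values held by the different ensemble members.
    # (The actual WASH shuffling rule may differ; see the paper.)
    stacked = np.stack(models)              # shape: (n_models, n_params)
    mask = rng.random(stacked.shape[1]) < p
    perm = rng.permutation(len(models))
    stacked[:, mask] = stacked[perm][:, mask]
    return [stacked[i] for i in range(len(models))]

def average_weights(models):
    # Final ensemble model: plain average of the (shuffled) weights.
    return np.mean(np.stack(models), axis=0)
```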
arXiv Detail & Related papers (2024-05-27T09:02:57Z) - Vanishing Variance Problem in Fully Decentralized Neural-Network Systems [0.8212195887472242]
Federated learning and gossip learning are emerging methodologies designed to mitigate data privacy concerns.
Our research introduces a variance-corrected model averaging algorithm.
Our simulation results demonstrate that our approach enables gossip learning to achieve convergence efficiency comparable to that of federated learning.
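The paper's exact correction is not reproduced here; as a hedged illustration of why a correction is needed, note that averaging two independent weight vectors scales their standard deviation by 1/sqrt(2), so a naive fix is to rescale deviations after each gossip exchange:

```python
import numpy as np

def gossip_average(w_a, w_b):
    # Plain pairwise gossip averaging: each exchange halves the
    # variance of independent weights, which can slow training.
    return (w_a + w_b) / 2.0

def variance_corrected_average(w_a, w_b):
    # Hypothetical correction (the paper's exact rule may differ):
    # rescale deviations from the mean by sqrt(2) to keep the
    # weight variance from collapsing across repeated exchanges.
    avg = (w_a + w_b) / 2.0
    return avg.mean() + (avg - avg.mean()) * np.sqrt(2.0)
```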
arXiv Detail & Related papers (2024-04-06T12:49:20Z) - Attacks on Robust Distributed Learning Schemes via Sensitivity Curve
Maximization [37.464005524259356]
We present a new attack based on sensitivity curve maximization (SCM).
We demonstrate that it is able to disrupt existing robust aggregation schemes by injecting small but effective perturbations.
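A sketch of the underlying tool: the classical sensitivity curve measures how much an estimator moves when a single new sample is injected, SC(x) = n * (T(samples + [x]) - T(samples)). An attacker can search for the injection that moves a robust aggregator the most; the grid search below is an illustrative assumption, not the paper's algorithm:

```python
import numpy as np

def sensitivity_curve(estimator, samples, x):
    # Classical sensitivity curve of estimator T at a new point x.
    n = len(samples) + 1
    return n * (estimator(np.append(samples, x)) - estimator(samples))

# Hypothetical attack sketch: scan candidate injections and keep the
# one that perturbs a robust aggregator (here, the median) the most.
rng = np.random.default_rng(1)
honest = rng.normal(size=49)
candidates = np.linspace(-5, 5, 1001)
scores = [abs(sensitivity_curve(np.median, honest, x)) for x in candidates]
best = candidates[int(np.argmax(scores))]
print("most disruptive small injection:", best)
```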
arXiv Detail & Related papers (2023-04-27T08:41:57Z) - Federated Learning Aggregation: New Robust Algorithms with Guarantees [63.96013144017572]
Federated learning has been recently proposed for distributed model training at the edge.
This paper presents a complete general mathematical convergence analysis to evaluate aggregation strategies in a federated learning framework.
We derive novel aggregation algorithms which are able to modify their model architecture by differentiating client contributions according to the value of their losses.
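A hedged sketch of loss-dependent aggregation (the softmax weighting and temperature below are illustrative choices, not necessarily the paper's rule):

```python
import numpy as np

def loss_weighted_aggregate(client_models, client_losses, temperature=1.0):
    # Illustrative rule: weight each client's model by a softmax over
    # negative losses, so clients with lower loss contribute more.
    losses = np.asarray(client_losses, dtype=float)
    logits = -losses / temperature
    weights = np.exp(logits - logits.max())   # numerically stable softmax
    weights /= weights.sum()
    stacked = np.stack(client_models)         # (n_clients, n_params)
    return np.tensordot(weights, stacked, axes=1)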
arXiv Detail & Related papers (2022-05-22T16:37:53Z) - Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
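As a minimal sketch, assuming a log-linear adversary (the paper's parameterization may differ), the adversary assigns each example a normalized likelihood ratio and the model minimizes the reweighted loss:

```python
import numpy as np

def dro_reweighted_loss(losses, features, theta):
    # Hedged sketch: a parametric adversary produces likelihood ratios
    # r(x) proportional to exp(theta . x), normalized to mean 1, and
    # the model is trained on the reweighted loss E[r(x) * loss(x)].
    logits = features @ theta
    r = np.exp(logits - logits.max())
    r = r / r.mean()
    return np.mean(r * losses)
```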
arXiv Detail & Related papers (2022-04-13T12:43:12Z) - Examining and Combating Spurious Features under Distribution Shift [94.31956965507085]
We define and analyze robust and spurious representations using the information-theoretic concept of minimal sufficient statistics.
We prove that even when there is only bias of the input distribution, models can still pick up spurious features from their training data.
Inspired by our analysis, we demonstrate that group DRO can fail when groups do not directly account for various spurious correlations.
arXiv Detail & Related papers (2021-06-14T05:39:09Z) - Coded Stochastic ADMM for Decentralized Consensus Optimization with Edge
Computing [113.52575069030192]
Big data, including data from applications with high security requirements, are often collected and stored on multiple heterogeneous devices, such as mobile devices, drones and vehicles.
Due to the limitations of communication costs and security requirements, it is of paramount importance to extract information in a decentralized manner instead of aggregating data to a fusion center.
We consider the problem of learning model parameters in a multi-agent system with data locally processed via distributed edge nodes.
A class of mini-batch alternating direction method of multipliers (ADMM) algorithms is explored to develop the distributed learning model.
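A minimal sketch of mini-batch consensus ADMM for distributed least squares follows (the coding layer that gives the scheme its robustness is omitted, and all parameter names are assumptions):

```python
import numpy as np

def consensus_admm_lstsq(A_list, b_list, rho=1.0, iters=50, batch=32, seed=0):
    # Each edge node i holds (A_i, b_i) and solves
    # min 0.5*||A_i x - b_i||^2 subject to x_i = z (the consensus
    # variable), drawing a fresh mini-batch per iteration.
    rng = np.random.default_rng(seed)
    n = A_list[0].shape[1]
    z = np.zeros(n)
    u = [np.zeros(n) for _ in A_list]
    x = [np.zeros(n) for _ in A_list]
    for _ in range(iters):
        for i, (A, b) in enumerate(zip(A_list, b_list)):
            idx = rng.choice(len(b), size=min(batch, len(b)), replace=False)
            Ab, bb = A[idx], b[idx]
            # Local x-update: ridge-regularized least squares.
            x[i] = np.linalg.solve(Ab.T @ Ab + rho * np.eye(n),
                                   Ab.T @ bb + rho * (z - u[i]))
        z = np.mean([x[i] + u[i] for i in range(len(x))], axis=0)  # consensus
        for i in range(len(u)):
            u[i] = u[i] + x[i] - z                                 # dual update
    return z
```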
arXiv Detail & Related papers (2020-10-02T10:41:59Z) - Learning Diverse Representations for Fast Adaptation to Distribution
Shift [78.83747601814669]
We present a method for learning multiple models, incorporating an objective that pressures each to learn a distinct way to solve the task.
We demonstrate our framework's ability to facilitate rapid adaptation to distribution shift.
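One common way to realize such an objective, shown here as an assumption rather than the paper's exact loss, is to penalize the cross-model covariance of the models' predictions:

```python
import numpy as np

def diversity_penalty(predictions):
    # Hedged sketch: penalize pairwise similarity between the
    # predictions of different models so each learns a distinct
    # solution; the paper's actual objective may differ.
    P = np.stack(predictions)               # (n_models, n_examples)
    P = P - P.mean(axis=1, keepdims=True)   # center each model's outputs
    C = P @ P.T / P.shape[1]                # cross-model covariance
    off_diag = C - np.diag(np.diag(C))
    return np.sum(off_diag ** 2)
```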
arXiv Detail & Related papers (2020-06-12T12:23:50Z) - Cluster-Based Social Reinforcement Learning [16.821802372973004]
Social Reinforcement Learning methods are useful for fake news mitigation, personalized teaching/healthcare, and viral marketing.
It is challenging to incorporate inter-agent dependencies into the models effectively due to network size and sparse interaction data.
Previous social RL approaches either ignore inter-agent dependencies or model them in a computationally intensive manner.
arXiv Detail & Related papers (2020-03-02T01:55:05Z)