Communication-Efficient Robust Federated Learning Over Heterogeneous
Datasets
- URL: http://arxiv.org/abs/2006.09992v3
- Date: Wed, 19 Aug 2020 06:00:38 GMT
- Title: Communication-Efficient Robust Federated Learning Over Heterogeneous
Datasets
- Authors: Yanjie Dong and Georgios B. Giannakis and Tianyi Chen and Julian Cheng
and Md. Jahangir Hossain and Victor C. M. Leung
- Abstract summary: This work investigates fault-resilient federated learning when the data samples are non-uniformly distributed across workers.
In the presence of adversarially faulty workers who may strategically corrupt datasets, the local messages exchanged can be unreliable.
The present work introduces a fault-resilient proximal gradient (FRPG) algorithm that relies on Nesterov's acceleration technique, along with a local variant (LFRPG) that reduces communication overhead.
For strongly convex loss functions, FRPG and LFRPG have provably faster convergence rates than a benchmark robust stochastic aggregation algorithm.
- Score: 147.11434031193164
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work investigates fault-resilient federated learning when the data
samples are non-uniformly distributed across workers, and the number of faulty
workers is unknown to the central server. In the presence of adversarially
faulty workers who may strategically corrupt datasets, the local messages
exchanged (e.g., local gradients and/or local model parameters) can be
unreliable, and thus the vanilla stochastic gradient descent (SGD) algorithm is
not guaranteed to converge. Recently developed algorithms improve upon vanilla
SGD by providing robustness to faulty workers at the price of slowing down
convergence. To remedy this limitation, the present work introduces a
fault-resilient proximal gradient (FRPG) algorithm that relies on Nesterov's
acceleration technique. To reduce the communication overhead of FRPG, a local
(L) FRPG algorithm is also developed to allow for intermittent server-workers
parameter exchanges. For strongly convex loss functions, FRPG and LFRPG have
provably faster convergence rates than a benchmark robust stochastic
aggregation algorithm. Moreover, LFRPG converges faster than FRPG while using
the same communication rounds. Numerical tests performed on various real
datasets confirm the accelerated convergence of FRPG and LFRPG over the robust
stochastic aggregation benchmark and competing alternatives.
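To make the described procedure concrete, the Python sketch below illustrates the structure the abstract outlines: Nesterov-accelerated proximal updates computed on robustly aggregated worker gradients, with optional intermittent communication in the spirit of LFRPG. It is not the authors' FRPG/LFRPG algorithm: the coordinate-wise median aggregator, the L2 proximal step, the momentum and step-size choices, and the worker interface (stochastic_grad) are all illustrative assumptions.
```python
import numpy as np

def robust_aggregate(grads):
    """Stand-in robust aggregator (coordinate-wise median); the paper
    uses its own robust aggregation rule."""
    return np.median(np.stack(grads), axis=0)

def prox_l2(v, lam, step):
    """Proximal operator of the regularizer (lam/2)*||x||^2 (illustrative choice)."""
    return v / (1.0 + lam * step)

def frpg_sketch(workers, x0, rounds=200, step=0.1, lam=1e-3, local_steps=1):
    """Nesterov-accelerated proximal updates on robustly aggregated gradients.

    `workers` is a hypothetical list of objects exposing stochastic_grad(x);
    some of them may be faulty and return corrupted gradients. Setting
    local_steps > 1 only loosely mimics LFRPG's intermittent server-worker
    exchanges by reusing the last aggregated gradient between rounds.
    """
    x, x_prev = x0.copy(), x0.copy()
    g = np.zeros_like(x0)
    for t in range(rounds):
        beta = t / (t + 3.0)                 # Nesterov momentum weight
        y = x + beta * (x - x_prev)          # extrapolated (lookahead) point
        if t % local_steps == 0:             # communication round
            g = robust_aggregate([w.stochastic_grad(y) for w in workers])
        x_prev, x = x, prox_l2(y - step * g, lam, step)
    return x
```
The paper's convergence guarantees depend on its specific robust aggregation rule and parameter schedules, which this sketch does not reproduce.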
Related papers
- FedScalar: A Communication efficient Federated Learning [0.0]
Federated learning (FL) has gained considerable popularity for distributed machine learning.
FedScalar enables agents to communicate updates using a single scalar.
arXiv Detail & Related papers (2024-10-03T07:06:49Z)
- Asynchronous Federated Stochastic Optimization for Heterogeneous Objectives Under Arbitrary Delays [0.0]
Federated learning (FL) was recently proposed to securely train models with data held over multiple locations ("clients").
Two major challenges hindering the performance of FL algorithms are long training times caused by straggling clients, and a decline in model accuracy under non-iid local data distributions ("client drift").
We propose and analyze Asynchronous Exact Averaging (AREA), a new (sub)gradient algorithm that utilizes communication to speed up convergence and enhance scalability, and employs client memory to correct the client drift caused by variations in client update frequencies.
arXiv Detail & Related papers (2024-05-16T14:22:49Z)
- FedAWARE: Maximizing Gradient Diversity for Heterogeneous Federated Server-side Optimization [37.743911787044475]
FedAWARE can enhance the performance of FL algorithms as a plug-in module.
arXiv Detail & Related papers (2023-10-04T10:15:57Z)
- DRAG: Divergence-based Adaptive Aggregation in Federated Learning on Non-IID Data [11.830891255837788]
Local stochastic gradient descent (SGD) is a fundamental approach in achieving communication efficiency in Federated Learning (FL).
We introduce a novel metric called "degree of divergence," quantifying the angle between the local gradient and the global reference direction.
We propose the divergence-based adaptive aggregation (DRAG) algorithm, which dynamically "drags" the received local updates toward the reference direction in each round without requiring extra communication overhead.
arXiv Detail & Related papers (2023-09-04T19:40:58Z)
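A rough illustration of the dragging idea summarized in the DRAG entry above: measure how far a received client update diverges (in angle) from a global reference direction, and blend strongly divergent updates toward that reference before averaging. The choice of reference (the previous global update) and the blending formula below are assumptions, not the paper's exact rule.
```python
import numpy as np

def drag_update(local_update, reference, gamma=0.5, eps=1e-12):
    """Blend a client update toward a global reference direction in
    proportion to its angular divergence (illustrative formula only)."""
    r_hat = reference / (np.linalg.norm(reference) + eps)
    norm_u = np.linalg.norm(local_update) + eps
    cos_div = float(np.dot(local_update, r_hat)) / norm_u  # "degree of divergence" via cosine
    mix = gamma * (1.0 - cos_div) / 2.0                    # 0 when aligned, up to gamma when opposed
    return (1.0 - mix) * local_update + mix * norm_u * r_hat

def drag_aggregate(client_updates, prev_global_update):
    """Server side: drag each received update toward the previous global
    update (assumed reference direction), then average."""
    dragged = [drag_update(u, prev_global_update) for u in client_updates]
    return np.mean(dragged, axis=0)
```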
- FedSkip: Combatting Statistical Heterogeneity with Federated Skip Aggregation [95.85026305874824]
We introduce a data-driven approach called FedSkip to improve the client optima by periodically skipping federated averaging and scattering local models to the cross devices.
We conduct extensive experiments on a range of datasets to demonstrate that FedSkip achieves much higher accuracy, better aggregation efficiency and comparable communication efficiency.
arXiv Detail & Related papers (2022-12-14T13:57:01Z)
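A minimal sketch of the skip-and-scatter idea in the FedSkip entry above, assuming a fixed skip period and a random permutation as the scattering rule (the paper's schedule and scattering strategy may differ); models are represented as NumPy parameter vectors.
```python
import numpy as np

def fedskip_round(client_models, round_idx, skip_period=3, rng=None):
    """On most rounds perform federated averaging; every skip_period-th
    round skip aggregation and instead scatter (permute) the local models
    across clients so each client continues from another client's model."""
    if rng is None:
        rng = np.random.default_rng(round_idx)
    if round_idx % skip_period == skip_period - 1:
        order = rng.permutation(len(client_models))        # scatter step
        return None, [client_models[i].copy() for i in order]
    avg = np.mean(np.stack(client_models), axis=0)          # FedAvg step
    return avg, [avg.copy() for _ in client_models]
```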
- TCT: Convexifying Federated Learning using Bootstrapped Neural Tangent Kernels [141.29156234353133]
State-of-the-art federated learning methods can perform far worse than their centralized counterparts when clients have dissimilar data distributions.
We show this disparity can largely be attributed to optimization challenges presented by nonconvexity.
We propose a Train-Convexify-Train (TCT) procedure to sidestep this issue.
arXiv Detail & Related papers (2022-07-13T16:58:22Z)
- Acceleration of Federated Learning with Alleviated Forgetting in Local Training [61.231021417674235]
Federated learning (FL) enables distributed optimization of machine learning models while protecting privacy.
We propose FedReg, an algorithm to accelerate FL with alleviated knowledge forgetting in the local training stage.
Our experiments demonstrate that FedReg significantly improves the convergence rate of FL, especially when the neural network architecture is deep.
arXiv Detail & Related papers (2022-03-05T02:31:32Z)
- Speeding up Heterogeneous Federated Learning with Sequentially Trained Superclients [19.496278017418113]
Federated Learning (FL) allows training machine learning models in privacy-constrained scenarios by enabling the cooperation of edge devices without requiring local data sharing.
This approach raises several challenges due to the different statistical distribution of the local datasets and the clients' computational heterogeneity.
We propose FedSeq, a novel framework leveraging the sequential training of subgroups of heterogeneous clients, i.e. superclients, to emulate the centralized paradigm in a privacy-compliant way.
arXiv Detail & Related papers (2022-01-26T12:33:23Z)
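A minimal sketch of the sequential superclient training described in the FedSeq entry above, assuming the model is simply passed from client to client within each superclient and the resulting superclient models are averaged by the server; local_train, the client grouping, and the aggregation rule are placeholders, not the paper's exact procedure.
```python
import numpy as np

def train_superclient(model, superclient, local_train):
    """Pass the model sequentially through one superclient's clients,
    emulating centralized training over their pooled data."""
    for client in superclient:
        model = local_train(model, client)   # placeholder local update step
    return model

def fedseq_round(global_model, superclients, local_train):
    """One communication round: each superclient trains sequentially from the
    current global model, then the server averages the superclient models."""
    models = [train_superclient(global_model.copy(), sc, local_train)
              for sc in superclients]
    return np.mean(np.stack(models), axis=0)
```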
- Faster Non-Convex Federated Learning via Global and Local Momentum [57.52663209739171]
FedGLOMO is the first (first-order) FL algorithm.
Our algorithm is provably optimal even with compressed communication between the clients and the server.
arXiv Detail & Related papers (2020-12-07T21:05:31Z)
- Learning while Respecting Privacy and Robustness to Distributional Uncertainties and Adversarial Data [66.78671826743884]
The distributionally robust optimization framework is considered for training a parametric model.
The objective is to endow the trained model with robustness against adversarially manipulated input data.
The proposed algorithms offer robustness with little overhead.
arXiv Detail & Related papers (2020-07-07T18:25:25Z)