Communication-Efficient Stochastic Zeroth-Order Optimization for
Federated Learning
- URL: http://arxiv.org/abs/2201.09531v1
- Date: Mon, 24 Jan 2022 08:56:06 GMT
- Title: Communication-Efficient Stochastic Zeroth-Order Optimization for
Federated Learning
- Authors: Wenzhi Fang, Ziyi Yu, Yuning Jiang, Yuanming Shi, Colin N. Jones, and
Yong Zhou
- Abstract summary: Federated learning (FL) enables edge devices to collaboratively train a global model without sharing their private data.
To enhance the training efficiency of FL, various algorithms have been proposed, ranging from first-order computation to first-order methods.
- Score: 28.65635956111857
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated learning (FL), as an emerging edge artificial intelligence
paradigm, enables many edge devices to collaboratively train a global model
without sharing their private data. To enhance the training efficiency of FL,
various algorithms have been proposed, ranging from first-order to second-order
methods. However, these algorithms cannot be applied in scenarios where the
gradient information is not available, e.g., federated black-box attack and
federated hyperparameter tuning. To address this issue, in this paper we
propose a derivative-free federated zeroth-order optimization (FedZO) algorithm
featured by performing multiple local updates based on stochastic gradient
estimators in each communication round and enabling partial device
participation. Under the non-convex setting, we derive the convergence
performance of the FedZO algorithm and characterize the impact of the numbers
of local iterates and participating edge devices on the convergence. To enable
communication-efficient FedZO over wireless networks, we further propose an
over-the-air computation (AirComp) assisted FedZO algorithm. With an
appropriate transceiver design, we show that the convergence of
AirComp-assisted FedZO can still be preserved under certain signal-to-noise
ratio conditions. Simulation results demonstrate the effectiveness of the FedZO
algorithm and validate the theoretical observations.
Related papers
- Over-the-Air Federated Learning and Optimization [52.5188988624998]
We focus on Federated learning (FL) via edge-the-air computation (AirComp)
We describe the convergence of AirComp-based FedAvg (AirFedAvg) algorithms under both convex and non- convex settings.
For different types of local updates that can be transmitted by edge devices (i.e., model, gradient, model difference), we reveal that transmitting in AirFedAvg may cause an aggregation error.
In addition, we consider more practical signal processing schemes to improve the communication efficiency and extend the convergence analysis to different forms of model aggregation error caused by these signal processing schemes.
arXiv Detail & Related papers (2023-10-16T05:49:28Z) - Federated Conditional Stochastic Optimization [110.513884892319]
Conditional optimization has found in a wide range of machine learning tasks, such as in-variant learning tasks, AUPRC, andAML.
This paper proposes algorithms for distributed federated learning.
arXiv Detail & Related papers (2023-10-04T01:47:37Z) - Stochastic Unrolled Federated Learning [85.6993263983062]
We introduce UnRolled Federated learning (SURF), a method that expands algorithm unrolling to federated learning.
Our proposed method tackles two challenges of this expansion, namely the need to feed whole datasets to the unrolleds and the decentralized nature of federated learning.
arXiv Detail & Related papers (2023-05-24T17:26:22Z) - Vertical Federated Learning over Cloud-RAN: Convergence Analysis and
System Optimization [82.12796238714589]
We propose a novel cloud radio access network (Cloud-RAN) based vertical FL system to enable fast and accurate model aggregation.
We characterize the convergence behavior of the vertical FL algorithm considering both uplink and downlink transmissions.
We establish a system optimization framework by joint transceiver and fronthaul quantization design, for which successive convex approximation and alternate convex search based system optimization algorithms are developed.
arXiv Detail & Related papers (2023-05-04T09:26:03Z) - Gradient Sparsification for Efficient Wireless Federated Learning with
Differential Privacy [25.763777765222358]
Federated learning (FL) enables distributed clients to collaboratively train a machine learning model without sharing raw data with each other.
As the model size grows, the training latency due to limited transmission bandwidth and private information degrades while using differential privacy (DP) protection.
We propose sparsification empowered FL framework wireless channels, in over to improve training efficiency without sacrificing convergence performance.
arXiv Detail & Related papers (2023-04-09T05:21:15Z) - Faster Adaptive Federated Learning [84.38913517122619]
Federated learning has attracted increasing attention with the emergence of distributed data.
In this paper, we propose an efficient adaptive algorithm (i.e., FAFED) based on momentum-based variance reduced technique in cross-silo FL.
arXiv Detail & Related papers (2022-12-02T05:07:50Z) - Over-the-Air Federated Learning via Second-Order Optimization [37.594140209854906]
Federated learning (FL) could result in task-oriented data traffic flows over wireless networks with limited radio resources.
We propose a novel over-the-air second-order federated optimization algorithm to simultaneously reduce the communication rounds and enable low-latency global model aggregation.
arXiv Detail & Related papers (2022-03-29T12:39:23Z) - Over-the-Air Decentralized Federated Learning [28.593149477080605]
We consider decentralized federated learning (FL) over wireless networks, where over-the-air computation (AirComp) is adopted to facilitate the local model consensus in a device-to-device (D2D) communication manner.
We propose an AirComp-based DSGD with gradient tracking and variance reduction (DSGT-VR) algorithm, where both precoding and decoding strategies are developed for D2D communication.
We prove that the proposed algorithm converges linearly and establish the optimality gap for strongly convex and smooth loss functions, taking into account the channel fading and noise.
arXiv Detail & Related papers (2021-06-15T09:42:33Z) - Fast Convergence Algorithm for Analog Federated Learning [30.399830943617772]
We propose an AirComp-based FedSplit algorithm for efficient analog federated learning over wireless channels.
We prove that the proposed algorithm linearly converges to the optimal solutions under the assumption that the objective function is strongly convex and smooth.
Our algorithm is theoretically and experimentally verified to be much more robust to the ill-conditioned problems with faster convergence compared with other benchmark FL algorithms.
arXiv Detail & Related papers (2020-10-30T10:59:49Z) - FedPD: A Federated Learning Framework with Optimal Rates and Adaptivity
to Non-IID Data [59.50904660420082]
Federated Learning (FL) has become a popular paradigm for learning from distributed data.
To effectively utilize data at different devices without moving them to the cloud, algorithms such as the Federated Averaging (FedAvg) have adopted a "computation then aggregation" (CTA) model.
arXiv Detail & Related papers (2020-05-22T23:07:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.