Communication-efficient Vertical Federated Learning via Compressed Error Feedback
- URL: http://arxiv.org/abs/2406.14420v3
- Date: Sat, 22 Feb 2025 20:52:01 GMT
- Title: Communication-efficient Vertical Federated Learning via Compressed Error Feedback
- Authors: Pedro Valdeira, João Xavier, Cláudia Soares, Yuejie Chi
- Abstract summary: Lossy compression is commonly used on the information communicated between the server and clients during training. In horizontal FL, each client holds a subset of the samples. We propose a training method for vertical FL, where each client holds a subset of the features. Our method converges linearly when the objective function satisfies the Polyak-Łojasiewicz inequality.
- Score: 24.32409923443071
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Communication overhead is a known bottleneck in federated learning (FL). To address this, lossy compression is commonly used on the information communicated between the server and clients during training. In horizontal FL, where each client holds a subset of the samples, such communication-compressed training methods have recently seen significant progress. However, in their vertical FL counterparts, where each client holds a subset of the features, our understanding remains limited. To address this, we propose an error feedback compressed vertical federated learning (EF-VFL) method to train split neural networks. In contrast to previous communication-compressed methods for vertical FL, EF-VFL does not require a vanishing compression error for the gradient norm to converge to zero for smooth nonconvex problems. By leveraging error feedback, our method can achieve a $\mathcal{O}(1/T)$ convergence rate for a sufficiently large batch size, improving over the state-of-the-art $\mathcal{O}(1/\sqrt{T})$ rate under $\mathcal{O}(1/\sqrt{T})$ compression error, and matching the rate of uncompressed methods. Further, when the objective function satisfies the Polyak-{\L}ojasiewicz inequality, our method converges linearly. In addition to improving convergence, our method also supports the use of private labels. Numerical experiments show that EF-VFL significantly improves over the prior art, confirming our theoretical results. The code for this work can be found at https://github.com/Valdeira/EF-VFL.
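As a rough, self-contained illustration of the error feedback mechanism described in the abstract (a minimal sketch, not the authors' EF-VFL implementation, which applies compression to the messages exchanged when training split neural networks; see the linked repository for the actual code), the Python snippet below compresses each outgoing vector with a top-k sparsifier and stores the dropped residual in a local error memory that is added back before the next compression step. The names `top_k` and `compress_with_error_feedback` are illustrative.

```python
import numpy as np

def top_k(x, k):
    """Lossy top-k compressor: keep the k largest-magnitude entries, zero the rest."""
    out = np.zeros_like(x)
    idx = np.argpartition(np.abs(x), -k)[-k:]
    out[idx] = x[idx]
    return out

def compress_with_error_feedback(message, error_memory, k):
    """Compress `message + error_memory`; return the compressed vector and the new residual.

    The residual (whatever the compressor dropped) is kept locally and added to the
    next message, so compression errors are corrected over rounds rather than lost.
    """
    corrected = message + error_memory
    compressed = top_k(corrected, k)
    return compressed, corrected - compressed

# Toy usage: a client repeatedly sends a compressed vector (e.g., an embedding or
# gradient block) to the server while maintaining its local error memory.
rng = np.random.default_rng(0)
error = np.zeros(10)
for t in range(3):
    message = rng.normal(size=10)
    sent, error = compress_with_error_feedback(message, error, k=3)
    print(f"round {t}: sent {np.count_nonzero(sent)} of {sent.size} entries")
```

The point of the error memory is that nothing the compressor drops is discarded for good, which is what allows error feedback to avoid the vanishing-compression-error requirement mentioned in the abstract.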
Related papers
- Vertical Federated Learning with Missing Features During Training and Inference [37.44022318612869]
We propose a vertical federated learning method for efficient training and inference of neural network-based models.
We show that our method achieves linear convergence to a neighborhood of the optimum even in the presence of missing features.
arXiv Detail & Related papers (2024-10-29T22:09:31Z) - Communication and Energy Efficient Federated Learning using Zero-Order Optimization Technique [14.986031916712108]
Federated learning (FL) is a popular machine learning technique that enables multiple users to collaboratively train a model while maintaining the user data privacy.
A significant challenge in FL is the communication bottleneck in the upload direction, and thus the corresponding energy consumption of the devices.
We show the superiority of our method, in terms of communication overhead and energy, as compared to standard gradient-based FL methods.
arXiv Detail & Related papers (2024-09-24T20:57:22Z) - Communication Efficient ConFederated Learning: An Event-Triggered SAGA
Approach [67.27031215756121]
Federated learning (FL) is a machine learning paradigm that targets model training without gathering the local data over various data sources.
Standard FL, which employs a single server, can only support a limited number of users, leading to degraded learning capability.
In this work, we consider a multi-server FL framework, referred to as Confederated Learning (CFL), in order to accommodate a larger number of users.
arXiv Detail & Related papers (2024-02-28T03:27:10Z) - Fed-CVLC: Compressing Federated Learning Communications with
Variable-Length Codes [54.18186259484828]
In Federated Learning (FL) paradigm, a parameter server (PS) concurrently communicates with distributed participating clients for model collection, update aggregation, and model distribution over multiple rounds.
We show strong evidence that variable-length codes are beneficial for compression in FL.
We present Fed-CVLC (Federated Learning Compression with Variable-Length Codes), which fine-tunes the code length in response to the dynamics of model updates.
arXiv Detail & Related papers (2024-02-06T07:25:21Z) - Communication-Efficient Vertical Federated Learning with Limited
Overlapping Samples [34.576230628844506]
We propose a vertical federated learning (VFL) framework called one-shot VFL.
In our proposed framework, the clients only need to communicate with the server once or only a few times.
Our methods can improve the accuracy by more than 46.5% and reduce the communication cost by more than 330$\times$ compared with state-of-the-art VFL methods.
arXiv Detail & Related papers (2023-03-28T19:30:23Z) - Improving Representational Continuity via Continued Pretraining [76.29171039601948]
A simple baseline from the transfer learning community (LP-FT: linear probing then fine-tuning) outperforms naive training and other continual learning methods.
LP-FT also reduces forgetting on a real-world satellite remote sensing dataset (FMoW).
A variant of LP-FT achieves state-of-the-art accuracy on an NLP continual learning benchmark.
arXiv Detail & Related papers (2023-02-26T10:39:38Z) - FedDA: Faster Framework of Local Adaptive Gradient Methods via Restarted
Dual Averaging [104.41634756395545]
Federated learning (FL) is an emerging learning paradigm to tackle massively distributed data.
We propose FedDA, a novel framework for local adaptive gradient methods.
We show that FedDA-MVR is the first adaptive FL algorithm that achieves this rate.
arXiv Detail & Related papers (2023-02-13T05:10:30Z) - Improving the Model Consistency of Decentralized Federated Learning [68.2795379609854]
Decentralized Federated Learning (DFL) discards the central server, and each client only communicates with its neighbors in a decentralized communication network.
Existing DFL suffers from inconsistency among local clients, which results in inferior performance compared to centralized FL (CFL).
We propose DFedSAM-MGS, where $1-\lambda$ is the spectral gap of the gossip matrix and $Q$ is the number of gossip steps.
arXiv Detail & Related papers (2023-02-08T14:37:34Z) - TCT: Convexifying Federated Learning using Bootstrapped Neural Tangent
Kernels [141.29156234353133]
State-of-the-art federated learning methods can perform far worse than their centralized counterparts when clients have dissimilar data distributions.
We show this disparity can largely be attributed to challenges presented by nonconvexity.
We propose a Train-Convexify neural network (TCT) procedure to sidestep this issue.
arXiv Detail & Related papers (2022-07-13T16:58:22Z) - Compressed-VFL: Communication-Efficient Learning with Vertically
Partitioned Data [15.85259386116784]
We propose Compressed Vertical Federated Learning (C-VFL) for communication-efficient training on vertically partitioned data.
We show experimentally that C-VFL can reduce communication by over 90% without a significant decrease in accuracy.
arXiv Detail & Related papers (2022-06-16T17:34:07Z) - ProgFed: Effective, Communication, and Computation Efficient Federated Learning by Progressive Training [65.68511423300812]
We propose ProgFed, a progressive training framework for efficient and effective federated learning.
ProgFed inherently reduces computation and two-way communication costs while maintaining the strong performance of the final models.
Our results show that ProgFed converges at the same rate as standard training on full models.
arXiv Detail & Related papers (2021-10-11T14:45:00Z) - FLASHE: Additively Symmetric Homomorphic Encryption for Cross-Silo
Federated Learning [9.177048551836897]
Homomorphic encryption (HE) is a promising privacy-preserving technique for cross-silo federated learning (FL).
arXiv Detail & Related papers (2021-09-02T02:36:04Z) - CFedAvg: Achieving Efficient Communication and Fast Convergence in
Non-IID Federated Learning [8.702106020664612]
Federated learning (FL) is a prevailing distributed learning paradigm, where a large number of workers jointly learn a model without sharing their training data.
High communication costs can arise in FL due to large-scale (deep) learning models and bandwidth-constrained connections.
We introduce a distributed communication-efficient algorithm called CFedAvg for non-IID FL with general SNR-constrained compressors.
arXiv Detail & Related papers (2021-06-14T04:27:19Z) - Compressed Communication for Distributed Training: Adaptive Methods and
System [13.244482588437972]
Communication overhead severely hinders the scalability of distributed machine learning systems.
Recently, there has been a growing interest in using gradient compression to reduce the communication overhead.
In this paper, we first introduce a novel adaptive gradient method with gradient compression.
arXiv Detail & Related papers (2021-05-17T13:41:47Z) - Faster Non-Convex Federated Learning via Global and Local Momentum [57.52663209739171]
FedGLOMO is the first (first-order) FL algorithm to combine global (server-side) and local (client-side) momentum.
Our algorithm is provably optimal even with compressed communication between the clients and the server.
arXiv Detail & Related papers (2020-12-07T21:05:31Z) - Over-the-Air Federated Learning from Heterogeneous Data [107.05618009955094]
Federated learning (FL) is a framework for distributed learning of centralized models.
We develop a Convergent OTA FL (COTAF) algorithm which enhances the common local stochastic gradient descent (SGD) FL algorithm.
We numerically show that the precoding induced by COTAF notably improves the convergence rate and the accuracy of models trained via OTA FL.
arXiv Detail & Related papers (2020-09-27T08:28:25Z) - VAFL: a Method of Vertical Asynchronous Federated Learning [40.423372614317195]
Horizontal federated learning (FL) handles multi-client data that share the same set of features.
Vertical FL trains a better predictor that combines features from different clients.
arXiv Detail & Related papers (2020-07-12T20:09:25Z) - On Biased Compression for Distributed Learning [55.89300593805943]
We show for the first time that biased compressors can lead to linear convergence rates both in the single node and distributed settings.
We propose several new biased compressors with promising theoretical guarantees and practical performance.
arXiv Detail & Related papers (2020-02-27T19:52:24Z)