Compressed-VFL: Communication-Efficient Learning with Vertically
Partitioned Data
- URL: http://arxiv.org/abs/2206.08330v2
- Date: Tue, 28 Mar 2023 22:03:09 GMT
- Title: Compressed-VFL: Communication-Efficient Learning with Vertically
Partitioned Data
- Authors: Timothy Castiglia, Anirban Das, Shiqiang Wang, Stacy Patterson
- Abstract summary: We propose Compressed Vertical Federated Learning (C-VFL) for communication-efficient training on vertically partitioned data.
We show experimentally that compression can reduce communication by over $90\%$ without a significant decrease in accuracy relative to VFL without compression.
- Score: 15.85259386116784
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose Compressed Vertical Federated Learning (C-VFL) for
communication-efficient training on vertically partitioned data. In C-VFL, a
server and multiple parties collaboratively train a model on their respective
features utilizing several local iterations and sharing compressed intermediate
results periodically. Our work provides the first theoretical analysis of the
effect message compression has on distributed training over vertically
partitioned data. We prove convergence of non-convex objectives at a rate of
$O(\frac{1}{\sqrt{T}})$ when the compression error is bounded over the course
of training. We provide specific requirements for convergence with common
compression techniques, such as quantization and top-$k$ sparsification.
Finally, we experimentally show compression can reduce communication by over
$90\%$ without a significant decrease in accuracy over VFL without compression.
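For concreteness, here is a minimal, illustrative sketch (not the authors' code) of the two compressors named in the abstract, quantization and top-$k$ sparsification, applied to the kind of embedding message a party would send to the server in C-VFL. Function names, shapes, and the choices of $k$ and bit width are assumptions made for illustration.
```python
# Illustrative sketch of the two compressors the paper analyzes
# (quantization and top-k sparsification); not the authors' implementation.
import numpy as np

def top_k_sparsify(h: np.ndarray, k: int) -> np.ndarray:
    """Keep the k largest-magnitude entries of h and zero out the rest."""
    flat = h.ravel().copy()
    if k < flat.size:
        cutoff = np.partition(np.abs(flat), -k)[-k]   # k-th largest magnitude
        flat[np.abs(flat) < cutoff] = 0.0
    return flat.reshape(h.shape)

def uniform_quantize(h: np.ndarray, num_bits: int = 8) -> np.ndarray:
    """Deterministic uniform quantization of h to 2**num_bits levels."""
    lo, hi = float(h.min()), float(h.max())
    if hi == lo:
        return h.copy()
    scale = (hi - lo) / (2 ** num_bits - 1)
    return lo + np.round((h - lo) / scale) * scale

# Example: a batch of 32 embeddings of width 64 from one party (shapes are
# placeholders), compressed before the periodic exchange with the server.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(32, 64))
sparse_msg = top_k_sparsify(embeddings, k=200)            # send 200 nonzeros
quantized_msg = uniform_quantize(embeddings, num_bits=4)  # send 4-bit values
```
The paper's convergence result requires the error introduced by such compressors to stay bounded over training; this sketch only shows what the operators do to a single message.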
Related papers
- Communication-efficient Vertical Federated Learning via Compressed Error Feedback [24.32409923443071]
Communication overhead is a known bottleneck in federated learning (FL).
We propose EFVFL, which uses compressed error feedback to train models over vertically partitioned data (a minimal error-feedback sketch appears after this list).
EFVFL does not require a vanishing compression error for smooth nonconvex problems.
arXiv Detail & Related papers (2024-06-20T15:40:38Z)
- Fed-CVLC: Compressing Federated Learning Communications with Variable-Length Codes [54.18186259484828]
In Federated Learning (FL) paradigm, a parameter server (PS) concurrently communicates with distributed participating clients for model collection, update aggregation, and model distribution over multiple rounds.
We show strong evidence that variable-length codes are beneficial for compression in FL.
We present Fed-CVLC (Federated Learning Compression with Variable-Length Codes), which fine-tunes the code length in response to the dynamics of model updates.
arXiv Detail & Related papers (2024-02-06T07:25:21Z)
- Activations and Gradients Compression for Model-Parallel Training [85.99744701008802]
We study how simultaneous compression of activations and gradients in model-parallel distributed training setup affects convergence.
We find that gradients require milder compression rates than activations.
Experiments also show that models trained with TopK perform well only when compression is also applied during inference.
arXiv Detail & Related papers (2024-01-15T15:54:54Z)
- GraVAC: Adaptive Compression for Communication-Efficient Distributed DL Training [0.0]
Distributed data-parallel (DDP) training improves overall application throughput as multiple devices train on a subset of data and aggregate updates to produce a globally shared model.
GraVAC is a framework to dynamically adjust compression factor throughout training by evaluating model progress and assessing information loss associated with compression.
Compared to using a static compression factor, GraVAC reduces end-to-end training time for ResNet101, VGG16, and LSTM by 4.32x, 1.95x, and 6.67x, respectively.
arXiv Detail & Related papers (2023-05-20T14:25:17Z)
- DoCoFL: Downlink Compression for Cross-Device Federated Learning [12.363097878376644]
$\textsf{DoCoFL}$ is a new framework for downlink compression in the cross-device setting.
It offers significant bi-directional bandwidth reduction while achieving accuracy competitive with that of a baseline without any compression.
arXiv Detail & Related papers (2023-02-01T16:08:54Z)
- Optimus-CC: Efficient Large NLP Model Training with 3D Parallelism Aware Communication Compression [8.591088380355252]
We present Optimus-CC, a fast and scalable distributed training framework for large NLP models with aggressive communication compression.
We propose techniques to avoid the model quality drop that comes from the compression.
We demonstrate our solution on a GPU cluster and achieve superior speedup over baseline state-of-the-art solutions for distributed training without sacrificing model quality.
arXiv Detail & Related papers (2023-01-24T06:07:55Z)
- ProgFed: Effective, Communication, and Computation Efficient Federated Learning by Progressive Training [65.68511423300812]
We propose ProgFed, a progressive training framework for efficient and effective federated learning.
ProgFed inherently reduces computation and two-way communication costs while maintaining the strong performance of the final models.
Our results show that ProgFed converges at the same rate as standard training on full models.
arXiv Detail & Related papers (2021-10-11T14:45:00Z)
- Towards Compact CNNs via Collaborative Compression [166.86915086497433]
We propose a Collaborative Compression scheme, which jointly applies channel pruning and tensor decomposition to compress CNN models.
We achieve 52.9% FLOPs reduction by removing 48.4% parameters on ResNet-50 with only a Top-1 accuracy drop of 0.56% on ImageNet 2012.
arXiv Detail & Related papers (2021-05-24T12:07:38Z)
- Compressed Communication for Distributed Training: Adaptive Methods and System [13.244482588437972]
Communication overhead severely hinders the scalability of distributed machine learning systems.
Recently, there has been a growing interest in using gradient compression to reduce the communication overhead.
In this paper, we first introduce a novel adaptive gradient method with gradient compression.
arXiv Detail & Related papers (2021-05-17T13:41:47Z)
- PowerGossip: Practical Low-Rank Communication Compression in Decentralized Deep Learning [62.440827696638664]
We introduce a simple algorithm that directly compresses the model differences between neighboring workers.
Inspired by the PowerSGD algorithm for centralized deep learning, this algorithm uses power iteration steps to maximize the information transferred per bit (a rank-1 power-step sketch appears after this list).
arXiv Detail & Related papers (2020-08-04T09:14:52Z)
- On Biased Compression for Distributed Learning [55.89300593805943]
We show for the first time that biased compressors can lead to linear convergence rates both in the single node and distributed settings.
We propose several new biased compressors with promising theoretical guarantees and practical performance.
arXiv Detail & Related papers (2020-02-27T19:52:24Z)
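As referenced in the EFVFL entry above, here is a minimal sketch of the error-feedback idea (an illustration under assumptions, not the EFVFL implementation): the part of a message lost to compression in one round is stored and added back before compressing in the next round, so the error is corrected over time rather than required to vanish.
```python
# Minimal error-feedback compressor sketch (illustrative; not EFVFL's code).
# The residual lost to compression in one round is fed back into the next.
import numpy as np

class ErrorFeedbackCompressor:
    def __init__(self, k: int):
        self.k = k
        self.residual = None   # compression error carried to the next round

    def compress(self, msg: np.ndarray) -> np.ndarray:
        if self.residual is None:
            self.residual = np.zeros_like(msg)
        corrected = msg + self.residual            # add back last round's error
        compressed = self._top_k(corrected)        # any biased compressor works here
        self.residual = corrected - compressed     # remember what was lost
        return compressed

    def _top_k(self, x: np.ndarray) -> np.ndarray:
        flat = x.ravel().copy()
        if self.k < flat.size:
            cutoff = np.partition(np.abs(flat), -self.k)[-self.k]
            flat[np.abs(flat) < cutoff] = 0.0
        return flat.reshape(x.shape)

# Usage: compress the same kind of message round after round.
comp = ErrorFeedbackCompressor(k=100)
rng = np.random.default_rng(1)
for _ in range(3):
    sent = comp.compress(rng.normal(size=(32, 64)))
```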
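Likewise, as referenced in the PowerGossip entry, a rough sketch of a single rank-1 power-iteration step for compressing a model-difference matrix into two vectors; the matrix shape and the reuse of the query vector across rounds are illustrative assumptions, not the paper's exact algorithm.
```python
# Rough sketch of a rank-1 power-iteration compression step in the spirit of
# PowerGossip (illustrative only; not the authors' implementation).
import numpy as np

def rank1_power_step(delta: np.ndarray, q: np.ndarray):
    """One power step: return a unit left factor p and a refined right factor."""
    p = delta @ q
    norm = np.linalg.norm(p)
    if norm > 0.0:
        p = p / norm
    q_new = delta.T @ p            # reused as the query vector in the next round
    return p, q_new                # receiver reconstructs delta ~= outer(p, q_new)

# Usage: a 256 x 128 model difference is sent as two vectors (256 + 128 numbers).
rng = np.random.default_rng(2)
delta = rng.normal(size=(256, 128))
q = rng.normal(size=(128,))
p, q = rank1_power_step(delta, q)
approx = np.outer(p, q)            # low-rank approximation at the receiver
```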