A flexible framework for communication-efficient machine learning: from HPC to IoT
- URL: http://arxiv.org/abs/2003.06377v2
- Date: Wed, 17 Jun 2020 07:58:19 GMT
- Title: A flexible framework for communication-efficient machine learning: from HPC to IoT
- Authors: Sarit Khirirat, Sindri Magnússon, Arda Aytekin, Mikael Johansson
- Abstract summary: Communication-efficiency is now needed in a variety of different system architectures.
We propose a flexible framework which adapts the compression level to the true gradient at each iteration.
Our framework is easy to adapt from one technology to the next by modeling how the communication cost depends on the compression level for the specific technology.
- Score: 13.300503079779952
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the increasing scale of machine learning tasks, it has become essential
to reduce the communication between computing nodes. Early work on gradient
compression focused on the bottleneck between CPUs and GPUs, but
communication-efficiency is now needed in a variety of different system
architectures, from high-performance clusters to energy-constrained IoT
devices. In current practice, compression levels are typically chosen before
training, and settings that work well for one task may be vastly
suboptimal for another dataset on another architecture. In this paper, we
propose a flexible framework which adapts the compression level to the true
gradient at each iteration, maximizing the improvement in the objective
function that is achieved per communicated bit. Our framework is easy to adapt
from one technology to the next by modeling how the communication cost depends
on the compression level for the specific technology. Theoretical results and
practical experiments indicate that the automatic tuning strategies
significantly increase communication efficiency on several state-of-the-art
compression schemes.
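As a rough illustration of the idea described in the abstract, the sketch below chooses a top-k sparsification level by maximizing an estimated decrease in the objective per communicated bit. The cost model (`payload_bits`), the smoothness-based descent estimate, and all function names are illustrative assumptions for this sketch, not the authors' actual criterion or code.

```python
import numpy as np

def compress_topk(g, k):
    """Top-k sparsification: keep the k largest-magnitude entries of g."""
    out = np.zeros_like(g)
    idx = np.argpartition(np.abs(g), -k)[-k:]
    out[idx] = g[idx]
    return out

def payload_bits(k, g_dim, float_bits=32):
    """Hypothetical cost model: bits to send k (value, index) pairs."""
    index_bits = int(np.ceil(np.log2(g_dim)))
    return k * (float_bits + index_bits)

def choose_compression_level(g, L, step, candidate_ks):
    """Pick the sparsity level k that maximizes estimated descent per bit.

    The descent estimate comes from the standard L-smoothness bound
        f(x - step*Q(g)) <= f(x) - step*<g, Q(g)> + (L*step**2/2)*||Q(g)||**2,
    used here only as a stand-in for the paper's selection rule.
    """
    best_k, best_ratio = None, -np.inf
    for k in candidate_ks:
        q = compress_topk(g, k)
        descent = step * np.dot(g, q) - 0.5 * L * step**2 * np.dot(q, q)
        ratio = descent / payload_bits(k, g.size)
        if ratio > best_ratio:
            best_k, best_ratio = k, ratio
    return best_k

# Toy usage on a random gradient
rng = np.random.default_rng(0)
g = rng.standard_normal(10_000)
k_star = choose_compression_level(g, L=1.0, step=0.1,
                                  candidate_ks=[10, 100, 1_000, 10_000])
print("selected sparsity level:", k_star)
```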
Related papers
- AsCAN: Asymmetric Convolution-Attention Networks for Efficient Recognition and Generation [48.82264764771652]
We introduce AsCAN, a hybrid architecture that combines convolutional and transformer blocks.
AsCAN supports a variety of tasks: recognition, segmentation, and class-conditional image generation.
We then scale the same architecture to solve a large-scale text-to-image task and show state-of-the-art performance.
arXiv Detail & Related papers (2024-11-07T18:43:17Z) - Communication-Efficient Federated Learning through Adaptive Weight
Clustering and Server-Side Distillation [10.541541376305245]
Federated Learning (FL) is a promising technique for the collaborative training of deep neural networks across multiple devices.
FL is hindered by excessive communication costs due to repeated server-client communication during training.
We propose FedCompress, a novel approach that combines dynamic weight clustering and server-side knowledge distillation.
arXiv Detail & Related papers (2024-01-25T14:49:15Z) - Federated learning compression designed for lightweight communications [0.0]
Federated Learning (FL) is a promising distributed method for edge-level machine learning.
In this paper, we investigate the impact of compression techniques on FL for a typical image classification task.
arXiv Detail & Related papers (2023-10-23T08:36:21Z) - A Machine Learning Framework for Distributed Functional Compression over
Wireless Channels in IoT [13.385373310554327]
Together, the enormous data generated by IoT devices and state-of-the-art machine learning techniques will revolutionize cyber-physical systems.
Traditional cloud-based methods that focus on transferring data to a central location either for training or inference place enormous strain on network resources.
We develop, to the best of our knowledge, the first machine learning framework for distributed functional compression over both the Gaussian Multiple Access Channel (GMAC) and AWGN channels.
arXiv Detail & Related papers (2022-01-24T06:38:39Z) - Communication-Compressed Adaptive Gradient Method for Distributed
Nonconvex Optimization [21.81192774458227]
One of the major bottlenecks is the large communication cost between the central server and the local workers.
Our proposed distributed learning framework features an effective gradient compression strategy.
arXiv Detail & Related papers (2021-11-01T04:54:55Z) - ProgFed: Effective, Communication, and Computation Efficient Federated Learning by Progressive Training [65.68511423300812]
We propose ProgFed, a progressive training framework for efficient and effective federated learning.
ProgFed inherently reduces computation and two-way communication costs while maintaining the strong performance of the final models.
Our results show that ProgFed converges at the same rate as standard training on full models.
arXiv Detail & Related papers (2021-10-11T14:45:00Z) - CosSGD: Nonlinear Quantization for Communication-efficient Federated
Learning [62.65937719264881]
Federated learning facilitates learning across clients without transferring local data on these clients to a central server.
We propose a nonlinear quantization for compressed gradient descent, which can be easily utilized in federated learning.
Our system significantly reduces the communication cost by up to three orders of magnitude, while maintaining convergence and accuracy of the training process.
arXiv Detail & Related papers (2020-12-15T12:20:28Z) - A Linearly Convergent Algorithm for Decentralized Optimization: Sending
Less Bits for Free! [72.31332210635524]
Decentralized optimization methods enable on-device training of machine learning models without a central coordinator.
We propose a new randomized first-order method which tackles the communication bottleneck by applying randomized compression operators.
We prove that our method can solve the problems without any increase in the number of communications compared to the baseline.
arXiv Detail & Related papers (2020-11-03T13:35:53Z) - PowerGossip: Practical Low-Rank Communication Compression in
Decentralized Deep Learning [62.440827696638664]
We introduce a simple algorithm that directly compresses the model differences between neighboring workers.
Inspired by PowerSGD for centralized deep learning, this algorithm uses power iteration steps to maximize the information transferred per bit (a minimal sketch of this power-iteration idea appears after this list).
arXiv Detail & Related papers (2020-08-04T09:14:52Z) - Ternary Compression for Communication-Efficient Federated Learning [17.97683428517896]
Federated learning provides a potential solution to privacy-preserving and secure machine learning.
We propose a ternary federated averaging protocol (T-FedAvg) to reduce the upstream and downstream communication of federated learning systems.
Our results show that the proposed T-FedAvg is effective in reducing communication costs and can even achieve slightly better performance on non-IID data.
arXiv Detail & Related papers (2020-03-07T11:55:34Z) - Structured Sparsification with Joint Optimization of Group Convolution
and Channel Shuffle [117.95823660228537]
We propose a novel structured sparsification method for efficient network compression.
The proposed method automatically induces structured sparsity on the convolutional weights.
We also address the problem of inter-group communication with a learnable channel shuffle mechanism.
arXiv Detail & Related papers (2020-02-19T12:03:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.