Adaptive Compression-Aware Split Learning and Inference for Enhanced
Network Efficiency
- URL: http://arxiv.org/abs/2311.05739v4
- Date: Thu, 1 Feb 2024 13:01:25 GMT
- Title: Adaptive Compression-Aware Split Learning and Inference for Enhanced
Network Efficiency
- Authors: Akrit Mudvari, Antero Vainio, Iason Ofeidis, Sasu Tarkoma, Leandros
Tassiulas
- Abstract summary: We develop an adaptive compression-aware split learning method ('deprune') to train deep learning models so that they are more network-efficient.
We show that the 'deprune' method can reduce network usage by 4x, without loss of accuracy, when compared with a split-learning approach that does not use our method.
We also show that the 'prune' method can reduce the training time for certain models by up to 6x without affecting the accuracy.
- Score: 8.863196307297692
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The growing number of AI-driven applications on mobile devices has led to
solutions that integrate deep learning models with the available edge-cloud
resources. Due to multiple benefits such as reduced on-device energy
consumption, improved latency, reduced network usage, and certain privacy
improvements, split learning, where deep learning models are split between the
mobile device and the edge-cloud resources and computed in a distributed
manner, has become an extensively explored topic. Incorporating
compression-aware methods, where learning adapts to the compression level of
the communicated data, has made split learning even more advantageous. This
method could even offer a viable alternative to traditional methods such as
federated learning. In this work, we develop an adaptive compression-aware
split learning method ('deprune') to train deep learning models so that they
are much more network-efficient, which makes them ideal to deploy on weaker
devices with the help of edge-cloud resources. The method is also extended
('prune') to train deep learning models very quickly through a transfer
learning approach, trading a small loss in accuracy for much more
network-efficient inference. We show that the 'deprune' method can reduce
network usage by 4x compared with a split-learning approach that does not use
our method, without loss of accuracy, while also improving accuracy over
compression-aware split learning by 4 percent. Lastly, we show that the
'prune' method can reduce the training time for certain models by up to 6x
without affecting accuracy, compared against a compression-aware
split-learning approach.
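The abstract's core idea, splitting a model between the device and the edge-cloud and training it end-to-end to tolerate compression of the activation that crosses the network, can be illustrated with a short sketch. The code below is a hypothetical, minimal example (the layer names, bottleneck width, and training loop are assumptions, not the paper's 'deprune'/'prune' implementation): a device-side head ends in a narrow bottleneck whose output is what would be transmitted, a server-side tail completes the forward pass, and training through the bottleneck is what makes the model compression-aware.

```python
# Illustrative sketch of compression-aware split learning (assumed layer
# names/sizes; not the paper's exact 'deprune'/'prune' architecture).
import torch
import torch.nn as nn

class DeviceHead(nn.Module):
    """Runs on the mobile device; ends in a narrow 'compression' layer."""
    def __init__(self, in_ch=3, feat_ch=64, bottleneck_ch=8):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_ch, feat_ch, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Bottleneck: fewer channels -> fewer bytes sent over the network.
        self.compress = nn.Conv2d(feat_ch, bottleneck_ch, 1)

    def forward(self, x):
        return self.compress(self.features(x))

class ServerTail(nn.Module):
    """Runs on the edge/cloud; decompresses and finishes the forward pass."""
    def __init__(self, bottleneck_ch=8, feat_ch=64, num_classes=10):
        super().__init__()
        self.decompress = nn.Conv2d(bottleneck_ch, feat_ch, 1)
        self.classifier = nn.Sequential(
            nn.ReLU(), nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(feat_ch, num_classes),
        )

    def forward(self, z):
        return self.classifier(self.decompress(z))

# One "compression-aware" training step: gradients flow through the
# bottleneck, so both halves learn to work with the compressed activation.
head, tail = DeviceHead(), ServerTail()
opt = torch.optim.SGD(list(head.parameters()) + list(tail.parameters()), lr=0.1)
x, y = torch.randn(4, 3, 32, 32), torch.randint(0, 10, (4,))
z = head(x)                    # would be sent device -> server
loss = nn.functional.cross_entropy(tail(z), y)
loss.backward()                # server -> device gradient of z
opt.step()
```

In this sketch the bandwidth saving comes only from the fixed bottleneck width; as described in the abstract, 'deprune' additionally adapts the compression level during training, and 'prune' uses transfer learning to reach a network-efficient configuration quickly at a small accuracy cost.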
Related papers
- Center-Sensitive Kernel Optimization for Efficient On-Device Incremental Learning [88.78080749909665]
Current on-device training methods focus on efficient training without considering catastrophic forgetting.
This paper proposes a simple but effective edge-friendly incremental learning framework.
Our method achieves an average accuracy boost of 38.08% with even less memory and approximate computation.
arXiv Detail & Related papers (2024-06-13T05:49:29Z)
- Efficient and Effective Augmentation Strategy for Adversarial Training [48.735220353660324]
Adversarial training of Deep Neural Networks is known to be significantly more data-hungry than standard training.
We propose Diverse Augmentation-based Joint Adversarial Training (DAJAT) to use data augmentations effectively in adversarial training.
arXiv Detail & Related papers (2022-10-27T10:59:55Z)
- ProgFed: Effective, Communication, and Computation Efficient Federated Learning by Progressive Training [65.68511423300812]
We propose ProgFed, a progressive training framework for efficient and effective federated learning.
ProgFed inherently reduces computation and two-way communication costs while maintaining the strong performance of the final models.
Our results show that ProgFed converges at the same rate as standard training on full models.
arXiv Detail & Related papers (2021-10-11T14:45:00Z)
- LCS: Learning Compressible Subspaces for Adaptive Network Compression at Inference Time [57.52251547365967]
We propose a method for training a "compressible subspace" of neural networks that contains a fine-grained spectrum of models.
We present results for achieving arbitrarily fine-grained accuracy-efficiency trade-offs at inference time for structured and unstructured sparsity.
Our algorithm extends to quantization at variable bit widths, achieving accuracy on par with individually trained networks.
arXiv Detail & Related papers (2021-10-08T17:03:34Z)
- Efficient and Private Federated Learning with Partially Trainable Networks [8.813191488656527]
We propose to leverage partially trainable neural networks, which freeze a portion of the model parameters during the entire training process.
We empirically show that Federated learning of Partially Trainable neural networks (FedPT) can result in superior communication-accuracy trade-offs.
Our approach also enables faster training with a smaller memory footprint and better utility for strong differential privacy guarantees.
arXiv Detail & Related papers (2021-10-06T04:28:33Z)
- Toward Communication Efficient Adaptive Gradient Method [29.02154169980269]
In recent years, distributed optimization has proven to be an effective approach to accelerating the training of large-scale machine learning models such as deep neural networks.
In the hope of training machine learning models on mobile devices, a new distributed training paradigm called 'federated learning' has become popular.
We propose an adaptive gradient method that can guarantee both the convergence and the communication efficiency for federated learning.
arXiv Detail & Related papers (2021-09-10T21:14:36Z)
- Dynamic Sparse Training for Deep Reinforcement Learning [36.66889208433228]
We propose for the first time to dynamically train deep reinforcement learning agents with sparse neural networks from scratch.
Our approach is easy to integrate into existing deep reinforcement learning algorithms.
We evaluate our approach on OpenAI Gym continuous control tasks.
arXiv Detail & Related papers (2021-06-08T09:57:20Z)
- Efficient Distributed Auto-Differentiation [22.192220404846267]
Gradient-based algorithms for training large deep neural networks (DNNs) are communication-heavy.
We introduce a surprisingly simple statistic for training distributed DNNs that is more communication-friendly than the gradient.
The process provides the flexibility of averaging gradients during backpropagation, enabling novel flexible training schemas.
arXiv Detail & Related papers (2021-02-18T21:46:27Z)
- Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks [78.47459801017959]
Sparsity can reduce the memory footprint of regular networks so that they fit on mobile devices.
We describe approaches to remove and add elements of neural networks, different training strategies to achieve model sparsity, and mechanisms to exploit sparsity in practice.
arXiv Detail & Related papers (2021-01-31T22:48:50Z)
- PowerGossip: Practical Low-Rank Communication Compression in Decentralized Deep Learning [62.440827696638664]
We introduce a simple algorithm that directly compresses the model differences between neighboring workers.
Inspired by PowerSGD for centralized deep learning, this algorithm uses power iteration steps to maximize the information transferred per bit; a simplified sketch follows this list.
arXiv Detail & Related papers (2020-08-04T09:14:52Z)
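To make the PowerGossip entry above concrete, here is a rough sketch under assumptions (not the authors' implementation) of compressing the model difference between two neighboring workers with a single power-iteration step: only two thin vectors are exchanged instead of the full difference matrix, and the right factor is warm-started across rounds as in PowerSGD-style schemes.

```python
# Rough sketch of PowerGossip-style rank-1 compression of the parameter
# difference between two neighboring workers (illustrative only).
import torch

def compress_difference(w_self, w_neighbor, q):
    """One power-iteration step on D = w_neighbor - w_self.

    Only the thin factors p (m,) and q (n,) would cross the network,
    instead of the full m x n difference matrix.
    """
    D = w_neighbor - w_self          # (m, n) difference matrix
    p = D @ q                        # (m,) left factor
    p = p / (p.norm() + 1e-12)       # normalize for numerical stability
    q_new = D.t() @ p                # (n,) right factor, reused next round
    return p, q_new

torch.manual_seed(0)
m, n = 256, 128
w_a = torch.randn(m, n)              # worker A's parameter block
w_b = torch.randn(m, n)              # neighbor B's parameter block
q = torch.randn(n)                   # warm-started across gossip rounds

p, q = compress_difference(w_a, w_b, q)
D_approx = torch.outer(p, q)         # rank-1 approximation of w_b - w_a
# Worker A moves part of the way toward its neighbor using the approximation.
w_a = w_a + 0.5 * D_approx
print(tuple(p.shape), tuple(q.shape))  # (256,) and (128,) sent, not 256*128
```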