Supervised Compression for Resource-constrained Edge Computing Systems
- URL: http://arxiv.org/abs/2108.11898v1
- Date: Sat, 21 Aug 2021 11:10:29 GMT
- Title: Supervised Compression for Resource-constrained Edge Computing Systems
- Authors: Yoshitomo Matsubara, Ruihan Yang, Marco Levorato, Stephan Mandt
- Abstract summary: Full-scale deep neural networks are often too resource-intensive in terms of energy and storage.
This paper adopts ideas from knowledge distillation and neural image compression to compress intermediate feature representations more efficiently.
It achieves better supervised rate-distortion performance while also reducing end-to-end latency.
- Score: 26.676557573171618
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There has been much interest in deploying deep learning algorithms on
low-powered devices, including smartphones, drones, and medical sensors.
However, full-scale deep neural networks are often too resource-intensive in
terms of energy and storage. As a result, the bulk of the machine learning
operation is often offloaded to an edge server, with the data compressed on the
device and transmitted. However, compressing data (such as images) leads to
transmitting information irrelevant to the supervised task. Another popular
approach is to split the deep network between the device and the server while
compressing intermediate features. To date, however, such split computing
strategies have barely outperformed the aforementioned naive data compression
baselines due to their inefficient approaches to feature compression. This
paper adopts ideas from knowledge distillation and neural image compression to
compress intermediate feature representations more efficiently. Our supervised
compression approach uses a teacher model and a student model with a stochastic
bottleneck and learnable prior for entropy coding. We compare our approach to
various neural image and feature compression baselines in three vision tasks
and find that it achieves better supervised rate-distortion performance while
also reducing end-to-end latency. We furthermore show that the
learned feature representations can be tuned to serve multiple downstream
tasks.
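To make the pipeline concrete, here is a minimal PyTorch sketch of the core idea: a lightweight on-device encoder with a stochastic (noise-based) bottleneck, a learnable prior that estimates the bits needed to transmit the latent, and a rate-distortion loss whose distortion term distills the teacher's features. All module names, layer sizes, the logistic prior, and the direct feature-matching distortion are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FactorizedPrior(nn.Module):
    """Per-channel logistic prior; a simplified stand-in for the
    learnable prior used for entropy coding in neural compression."""
    def __init__(self, channels):
        super().__init__()
        self.loc = nn.Parameter(torch.zeros(channels))
        self.log_scale = nn.Parameter(torch.zeros(channels))

    def likelihood(self, z):
        # P(z - 0.5 < Z <= z + 0.5) under a logistic distribution.
        s = self.log_scale.exp().view(1, -1, 1, 1)
        mu = self.loc.view(1, -1, 1, 1)
        upper = torch.sigmoid((z + 0.5 - mu) / s)
        lower = torch.sigmoid((z - 0.5 - mu) / s)
        return (upper - lower).clamp(min=1e-9)

class BottleneckStudent(nn.Module):
    """Lightweight on-device encoder with a stochastic bottleneck."""
    def __init__(self, bottleneck_ch=24):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(64, bottleneck_ch, 5, stride=2, padding=2),
        )
        self.prior = FactorizedPrior(bottleneck_ch)

    def forward(self, x):
        z = self.encoder(x)
        # Additive uniform noise approximates quantization during training.
        z_hat = z + torch.empty_like(z).uniform_(-0.5, 0.5) if self.training \
            else torch.round(z)
        rate = -torch.log2(self.prior.likelihood(z_hat)).sum() / x.size(0)
        return z_hat, rate

# Rate-distortion training step: distortion is the distance to the
# teacher's intermediate features (knowledge distillation), rate is
# the estimated bits needed to transmit the bottleneck.
student = BottleneckStudent()
x = torch.randn(8, 3, 224, 224)
with torch.no_grad():
    teacher_feat = torch.randn(8, 24, 56, 56)  # stand-in teacher features
z_hat, rate = student(x)
lam = 0.01  # rate-distortion trade-off (illustrative value)
loss = F.mse_loss(z_hat, teacher_feat) + lam * rate
loss.backward()
```

At inference time, the rounded latent would be entropy-coded with the learned prior on the device and decoded on the server before the rest of the network runs.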
Related papers
- UniCompress: Enhancing Multi-Data Medical Image Compression with Knowledge Distillation [59.3877309501938]
Implicit Neural Representation (INR) networks have shown remarkable versatility due to their flexible compression ratios.
We introduce a codebook containing frequency domain information as a prior input to the INR network.
This enhances the representational power of INR and provides distinctive conditioning for different image blocks.
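As a rough illustration of conditioning an INR on a prior code, here is a sketch in which a coordinate MLP receives a codebook entry alongside pixel coordinates; the `prior_code`, layer sizes, and fitting loop are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class ConditionedINR(nn.Module):
    """Coordinate MLP whose input is augmented with a prior code,
    e.g. a codebook entry carrying frequency-domain statistics."""
    def __init__(self, code_dim=32, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 + code_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),   # predicted intensity at (x, y)
        )

    def forward(self, coords, prior_code):
        code = prior_code.expand(coords.size(0), -1)
        return self.net(torch.cat([coords, code], dim=-1))

# Fit the INR to one image block; the code vector conditions the fit.
inr = ConditionedINR()
coords = torch.rand(1024, 2) * 2 - 1   # normalized pixel coordinates
targets = torch.rand(1024, 1)          # stand-in pixel intensities
prior_code = torch.randn(1, 32)        # stand-in codebook entry
opt = torch.optim.Adam(inr.parameters(), lr=1e-3)
for _ in range(200):
    opt.zero_grad()
    loss = ((inr(coords, prior_code) - targets) ** 2).mean()
    loss.backward()
    opt.step()
```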
arXiv Detail & Related papers (2024-05-27T05:52:13Z)
- Streaming Lossless Volumetric Compression of Medical Images Using Gated Recurrent Convolutional Neural Network [0.0]
This paper introduces a hardware-friendly streaming lossless volumetric compression framework.
We propose a gated recurrent convolutional neural network that combines diverse convolutional structures and fusion gate mechanisms.
Our method exhibits robust generalization ability and competitive compression speed.
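Below is a generic sketch of a gated recurrent convolutional unit (a ConvGRU-style cell) scanned across volume slices; the paper's network combines several convolutional structures and fusion gates, which this simplified cell only hints at.

```python
import torch
import torch.nn as nn

class ConvGRUCell(nn.Module):
    """Convolutional GRU: a gated recurrent unit whose gates are
    computed by convolutions, applied slice by slice over a volume."""
    def __init__(self, ch):
        super().__init__()
        self.gates = nn.Conv2d(2 * ch, 2 * ch, 3, padding=1)  # update/reset
        self.cand = nn.Conv2d(2 * ch, ch, 3, padding=1)       # candidate state

    def forward(self, x, h):
        zr = torch.sigmoid(self.gates(torch.cat([x, h], dim=1)))
        z, r = zr.chunk(2, dim=1)
        h_tilde = torch.tanh(self.cand(torch.cat([x, r * h], dim=1)))
        return (1 - z) * h + z * h_tilde

# Scan a volume slice by slice; the hidden state carries inter-slice
# context that a lossless coder could use to predict the next slice.
cell = ConvGRUCell(ch=16)
volume = torch.randn(64, 16, 128, 128)  # depth, channels, H, W
h = torch.zeros(1, 16, 128, 128)
for d in range(volume.size(0)):
    h = cell(volume[d:d + 1], h)
```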
arXiv Detail & Related papers (2023-11-27T07:19:09Z)
- Towards Hardware-Specific Automatic Compression of Neural Networks [0.0]
Pruning and quantization are currently the major approaches to compressing neural networks.
Effective compression policies consider the influence of the specific hardware architecture on the used compression methods.
We propose an algorithmic framework called Galen that uses reinforcement learning to search for such policies, combining pruning and quantization.
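A minimal sketch of the action space such a search operates over, assuming per-layer magnitude pruning plus uniform quantization; the random policy below merely stands in for the RL agent, and the helper `compress_layer` is hypothetical.

```python
import torch
import torch.nn as nn

def compress_layer(weight, prune_ratio, bits):
    """Apply a (prune_ratio, bit-width) action to one layer's weights:
    magnitude pruning followed by uniform quantization."""
    flat = weight.abs().flatten()
    k = int(prune_ratio * flat.numel())
    threshold = flat.kthvalue(k).values if k > 0 else torch.tensor(-1.0)
    w = weight * (weight.abs() > threshold)
    scale = w.abs().max() / (2 ** (bits - 1) - 1) + 1e-12
    q = torch.round(w / scale).clamp(-(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
    return q * scale

# A policy (here: random, standing in for the RL agent) picks one action
# per layer; the reward would combine accuracy and measured latency on
# the target hardware.
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))
for layer in [m for m in model if isinstance(m, nn.Linear)]:
    action = (float(torch.rand(1)) * 0.5, int(torch.randint(4, 9, (1,))))
    layer.weight.data = compress_layer(layer.weight.data, *action)
```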
arXiv Detail & Related papers (2022-12-15T13:34:02Z)
- Crowd Counting on Heavily Compressed Images with Curriculum Pre-Training [90.76576712433595]
Applying lossy compression to images processed by deep neural networks can lead to significant accuracy degradation.
Inspired by the curriculum learning paradigm, we present a novel training approach called curriculum pre-training (CPT) for crowd counting on compressed images.
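One plausible reading of the curriculum, sketched below under assumed details: pre-train on lightly compressed images and progressively lower the JPEG quality. The schedule, toy data, and counting head are all illustrative; real crowd counting would regress density maps.

```python
import io
import torch
import torch.nn as nn
from PIL import Image
from torchvision import transforms

def jpeg_compress(img, quality):
    """Re-encode a PIL image at a given JPEG quality (lower = harsher)."""
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return Image.open(buf).convert("RGB")

# Stand-in data and counting head.
images = [Image.new("RGB", (64, 64), (i * 30 % 256, 80, 120)) for i in range(8)]
counts = torch.rand(8, 1)
model = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64 * 3, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
to_tensor = transforms.ToTensor()

# Curriculum: pre-train on lightly compressed images first, then on
# progressively harsher compression, so the model adapts gradually.
for quality in [90, 70, 50, 30, 10]:   # illustrative schedule
    for img, y in zip(images, counts):
        x = to_tensor(jpeg_compress(img, quality)).unsqueeze(0)
        loss = (model(x) - y).pow(2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
```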
arXiv Detail & Related papers (2022-08-15T08:43:21Z)
- COIN++: Data Agnostic Neural Compression [55.27113889737545]
COIN++ is a neural compression framework that seamlessly handles a wide range of data modalities.
We demonstrate the effectiveness of our method by compressing various data modalities.
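The underlying COIN recipe is to overfit a small coordinate network to a signal and store its (quantized) weights as the code; the sketch below shows that base idea, while COIN++ itself adds meta-learned modulations and quantization, which are omitted here. Proper SIREN initialization is also skipped for brevity.

```python
import torch
import torch.nn as nn

class Siren(nn.Module):
    """Small sinusoidal coordinate network; storing its weights
    *is* the compressed representation of the signal."""
    def __init__(self, hidden=32, w0=30.0):
        super().__init__()
        self.l1 = nn.Linear(2, hidden)
        self.l2 = nn.Linear(hidden, hidden)
        self.l3 = nn.Linear(hidden, 3)
        self.w0 = w0

    def forward(self, coords):
        h = torch.sin(self.w0 * self.l1(coords))
        h = torch.sin(self.w0 * self.l2(h))
        return self.l3(h)

# "Compress" one image by overfitting the network to (coordinate -> RGB).
# Other modalities only change what the coordinates index into.
H = W = 32
ys, xs = torch.meshgrid(torch.linspace(-1, 1, H), torch.linspace(-1, 1, W),
                        indexing="ij")
coords = torch.stack([xs, ys], dim=-1).reshape(-1, 2)
pixels = torch.rand(H * W, 3)   # stand-in image
net = Siren()
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for _ in range(500):
    opt.zero_grad()
    loss = ((net(coords) - pixels) ** 2).mean()
    loss.backward()
    opt.step()
n_params = sum(p.numel() for p in net.parameters())
print(f"stored parameters: {n_params} vs raw floats: {H * W * 3}")
```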
arXiv Detail & Related papers (2022-01-30T20:12:04Z)
- On Effects of Compression with Hyperdimensional Computing in Distributed Randomized Neural Networks [6.25118865553438]
We propose a model for distributed classification based on randomized neural networks and hyperdimensional computing, together with a more flexible approach to compression, which we compare to conventional compression algorithms, dimensionality reduction, and quantization techniques.
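A generic hyperdimensional-computing sketch of the kind of compression involved, assuming key-value superposition (not necessarily the paper's exact scheme): several vectors are bound with random bipolar keys and superimposed into a single hypervector, and unbinding recovers each one up to zero-mean crosstalk.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 10_000, 10   # dimensionality; vectors packed per bundle

values = rng.normal(size=(n, d))             # activations to transmit
keys = rng.choice([-1.0, 1.0], size=(n, d))  # random bipolar key per vector

# Compress n*d numbers into d: bind each vector with its key (elementwise
# product) and superimpose the results into a single hypervector.
bundle = (keys * values).sum(axis=0)

# Unbind with key 0: value_0 plus zero-mean crosstalk from the others.
recovered = keys[0] * bundle
cos = lambda a, b: a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
print(cos(recovered, values[0]))   # ~1/sqrt(n): clearly above chance
print(cos(recovered, values[1]))   # ~0: other vectors look like noise
```

The recovery is noisy by design; the compression is lossy, and downstream readouts must tolerate the crosstalk, which is exactly the trade-off such a study measures.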
arXiv Detail & Related papers (2021-06-17T22:02:40Z)
- Analyzing and Mitigating JPEG Compression Defects in Deep Learning [69.04777875711646]
We present a unified study of the effects of JPEG compression on a range of common tasks and datasets.
We show that there is a significant penalty on common performance metrics for high compression.
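Such a study can be approximated with a simple sweep; the sketch below re-encodes an image (the file name `sample.jpg` is a placeholder) at decreasing JPEG quality and watches a pretrained classifier's prediction drift.

```python
import io
import torch
from PIL import Image
from torchvision import transforms
from torchvision.models import resnet18

def recompress(img, quality):
    """Round-trip an image through JPEG at the given quality."""
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return Image.open(buf).convert("RGB")

model = resnet18(weights="IMAGENET1K_V1").eval()   # downloads weights
prep = transforms.Compose([
    transforms.Resize(256), transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

img = Image.open("sample.jpg").convert("RGB")   # placeholder test image
with torch.no_grad():
    for q in [95, 75, 50, 25, 10, 5]:
        logits = model(prep(recompress(img, q)).unsqueeze(0))
        print(q, logits.argmax().item())        # prediction vs quality
```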
arXiv Detail & Related papers (2020-11-17T20:32:57Z)
- Permute, Quantize, and Fine-tune: Efficient Compression of Neural Networks [70.0243910593064]
Key to the success of vector quantization is deciding which parameter groups should be compressed together.
In this paper we make the observation that the weights of two adjacent layers can be permuted while expressing the same function.
We then establish a connection to rate-distortion theory and search for permutations that result in networks that are easier to compress.
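The permutation invariance is easy to verify directly; the sketch below permutes one layer's output units and the next layer's input units with the same permutation and checks that the composed function is unchanged.

```python
import torch
import torch.nn as nn

# Two adjacent layers compute f(x) = W2 @ relu(W1 @ x + b1) + b2.
torch.manual_seed(0)
l1, l2 = nn.Linear(16, 32), nn.Linear(32, 8)
x = torch.randn(4, 16)
ref = l2(torch.relu(l1(x)))

# Permute l1's output units and l2's input units with the same
# permutation: the function is unchanged, but the weight groupings
# (and hence how well they vector-quantize) can differ dramatically.
perm = torch.randperm(32)
l1.weight.data = l1.weight.data[perm]
l1.bias.data = l1.bias.data[perm]
l2.weight.data = l2.weight.data[:, perm]

out = l2(torch.relu(l1(x)))
print(torch.allclose(ref, out, atol=1e-6))   # True: same function
```

The paper's search exploits exactly this freedom: among all function-preserving permutations, it looks for those whose weight groupings quantize with less error.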
arXiv Detail & Related papers (2020-10-29T15:47:26Z)
- PowerGossip: Practical Low-Rank Communication Compression in Decentralized Deep Learning [62.440827696638664]
We introduce a simple algorithm that directly compresses the model differences between neighboring workers.
Inspired by PowerSGD for centralized deep learning, this algorithm uses power iteration steps to maximize the information transferred per bit.
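A sketch of a single power step in the PowerSGD family, on which PowerGossip builds; details such as reusing the previous iteration's basis across steps and exactly which matrices neighbors exchange are simplified away.

```python
import torch

def power_compress(M, rank=2):
    """One power-iteration step gives a rank-`rank` approximation of M,
    transmitted as two thin matrices instead of the full m x n block."""
    m, n = M.shape
    P = torch.randn(n, rank)        # shared random init (seeded in practice)
    Q, _ = torch.linalg.qr(M @ P)   # m x rank, orthonormal columns
    R = M.t() @ Q                   # n x rank
    return Q, R                     # M_hat = Q @ R.t()

# Compress the parameter difference between two neighboring workers;
# such differences are often close to low-rank.
diff = torch.randn(256, 4) @ torch.randn(4, 128)
Q, R = power_compress(diff, rank=4)
ratio = diff.numel() / (Q.numel() + R.numel())
err = (diff - Q @ R.t()).norm() / diff.norm()
print(f"compression {ratio:.1f}x, relative error {err:.6f}")
```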
arXiv Detail & Related papers (2020-08-04T09:14:52Z)
- Distributed Learning and Inference with Compressed Images [40.07509530656681]
This paper focuses on vision-based perception for autonomous driving as a paradigmatic scenario.
We propose dataset restoration, based on image restoration with generative adversarial networks (GANs).
Our method is agnostic to both the particular image compression method and the downstream task.
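A heavily simplified sketch of the pipeline idea: degrade training images the way deployment would, then restore them before training the downstream model. The fresh conv net standing in for the GAN-based restorer is purely a placeholder; in practice it would be a pretrained generator.

```python
import io
import torch
import torch.nn as nn
from PIL import Image
from torchvision import transforms

def jpeg_roundtrip(img, quality=10):
    """Simulate the degraded images the deployed system actually sees."""
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return Image.open(buf).convert("RGB")

# Placeholder restorer standing in for the paper's GAN-based restoration.
restorer = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(16, 3, 3, padding=1))

to_tensor = transforms.ToTensor()
img = Image.new("RGB", (64, 64), (120, 90, 60))  # stand-in training image
degraded = to_tensor(jpeg_roundtrip(img)).unsqueeze(0)
with torch.no_grad():
    restored = restorer(degraded)   # train the downstream task on this
```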
arXiv Detail & Related papers (2020-04-22T11:20:53Z)