Supervised Compression for Resource-constrained Edge Computing Systems
- URL: http://arxiv.org/abs/2108.11898v1
- Date: Sat, 21 Aug 2021 11:10:29 GMT
- Title: Supervised Compression for Resource-constrained Edge Computing Systems
- Authors: Yoshitomo Matsubara, Ruihan Yang, Marco Levorato, Stephan Mandt
- Abstract summary: Full-scale deep neural networks are often too resource-intensive in terms of energy and storage.
This paper adopts ideas from knowledge distillation and neural image compression to compress intermediate feature representations more efficiently.
It achieves better supervised rate-distortion performance while also reducing end-to-end latency.
- Score: 26.676557573171618
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There has been much interest in deploying deep learning algorithms on
low-powered devices, including smartphones, drones, and medical sensors.
However, full-scale deep neural networks are often too resource-intensive in
terms of energy and storage. As a result, the bulk of the machine learning
operation is often offloaded to an edge server, with the data compressed on the
device and transmitted. However, compressing data (such as images) leads to
transmitting information irrelevant to the supervised task. Another popular
approach is to split the deep network between the device and the server while
compressing intermediate features. To date, however, such split computing
strategies have barely outperformed the aforementioned naive data compression
baselines due to their inefficient approaches to feature compression. This
paper adopts ideas from knowledge distillation and neural image compression to
compress intermediate feature representations more efficiently. Our supervised
compression approach uses a teacher model and a student model with a stochastic
bottleneck and learnable prior for entropy coding. We compare our approach to
various neural image and feature compression baselines in three vision tasks
and find that it achieves better supervised rate-distortion performance while
also reducing end-to-end latency. We furthermore show that the
learned feature representations can be tuned to serve multiple downstream
tasks.
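To make the pipeline concrete, here is a minimal PyTorch sketch of the core idea: a lightweight on-device encoder with a stochastic (noise-based) bottleneck, a learnable prior that estimates the bits needed to transmit the latent, and a rate-distortion loss whose distortion term distills the teacher's features. All module names, layer sizes, the logistic prior, and the direct feature-matching distortion are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FactorizedPrior(nn.Module):
    """Per-channel logistic prior; a simplified stand-in for the
    learnable prior used for entropy coding in neural compression."""
    def __init__(self, channels):
        super().__init__()
        self.loc = nn.Parameter(torch.zeros(channels))
        self.log_scale = nn.Parameter(torch.zeros(channels))

    def likelihood(self, z):
        # P(z - 0.5 < Z <= z + 0.5) under a logistic distribution.
        s = self.log_scale.exp().view(1, -1, 1, 1)
        mu = self.loc.view(1, -1, 1, 1)
        upper = torch.sigmoid((z + 0.5 - mu) / s)
        lower = torch.sigmoid((z - 0.5 - mu) / s)
        return (upper - lower).clamp(min=1e-9)

class BottleneckStudent(nn.Module):
    """Lightweight on-device encoder with a stochastic bottleneck."""
    def __init__(self, bottleneck_ch=24):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(64, bottleneck_ch, 5, stride=2, padding=2),
        )
        self.prior = FactorizedPrior(bottleneck_ch)

    def forward(self, x):
        z = self.encoder(x)
        # Additive uniform noise approximates quantization during training.
        z_hat = z + torch.empty_like(z).uniform_(-0.5, 0.5) if self.training \
            else torch.round(z)
        rate = -torch.log2(self.prior.likelihood(z_hat)).sum() / x.size(0)
        return z_hat, rate

# Rate-distortion training step: distortion is the distance to the
# teacher's intermediate features (knowledge distillation), rate is
# the estimated bits needed to transmit the bottleneck.
student = BottleneckStudent()
x = torch.randn(8, 3, 224, 224)
with torch.no_grad():
    teacher_feat = torch.randn(8, 24, 56, 56)  # stand-in teacher features
z_hat, rate = student(x)
lam = 0.01  # rate-distortion trade-off (illustrative value)
loss = F.mse_loss(z_hat, teacher_feat) + lam * rate
loss.backward()
```

At inference time, the rounded latent would be entropy-coded with the learned prior on the device and decoded on the server before the rest of the network runs.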
Related papers
- UniCompress: Enhancing Multi-Data Medical Image Compression with Knowledge Distillation [59.3877309501938]
Implicit Neural Representation (INR) networks have shown remarkable versatility due to their flexible compression ratios.
We introduce a codebook containing frequency domain information as a prior input to the INR network.
This enhances the representational power of INR and provides distinctive conditioning for different image blocks.
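As a rough illustration of conditioning an INR on a prior code, here is a sketch in which a coordinate MLP receives a codebook entry alongside pixel coordinates; the `prior_code`, layer sizes, and fitting loop are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class ConditionedINR(nn.Module):
    """Coordinate MLP whose input is augmented with a prior code,
    e.g. a codebook entry carrying frequency-domain statistics."""
    def __init__(self, code_dim=32, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 + code_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),   # predicted intensity at (x, y)
        )

    def forward(self, coords, prior_code):
        code = prior_code.expand(coords.size(0), -1)
        return self.net(torch.cat([coords, code], dim=-1))

# Fit the INR to one image block; the code vector conditions the fit.
inr = ConditionedINR()
coords = torch.rand(1024, 2) * 2 - 1   # normalized pixel coordinates
targets = torch.rand(1024, 1)          # stand-in pixel intensities
prior_code = torch.randn(1, 32)        # stand-in codebook entry
opt = torch.optim.Adam(inr.parameters(), lr=1e-3)
for _ in range(200):
    opt.zero_grad()
    loss = ((inr(coords, prior_code) - targets) ** 2).mean()
    loss.backward()
    opt.step()
```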
arXiv Detail & Related papers (2024-05-27T05:52:13Z)
- Streaming Lossless Volumetric Compression of Medical Images Using Gated Recurrent Convolutional Neural Network [0.0]
This paper introduces a hardware-friendly streaming lossless volumetric compression framework.
We propose a gated recurrent convolutional neural network that combines diverse convolutional structures and fusion gate mechanisms.
Our method exhibits robust generalization ability and competitive compression speed.
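Below is a generic sketch of a gated recurrent convolutional unit (a ConvGRU-style cell) scanned across volume slices; the paper's network combines several convolutional structures and fusion gates, which this simplified cell only hints at.

```python
import torch
import torch.nn as nn

class ConvGRUCell(nn.Module):
    """Convolutional GRU: a gated recurrent unit whose gates are
    computed by convolutions, applied slice by slice over a volume."""
    def __init__(self, ch):
        super().__init__()
        self.gates = nn.Conv2d(2 * ch, 2 * ch, 3, padding=1)  # update/reset
        self.cand = nn.Conv2d(2 * ch, ch, 3, padding=1)       # candidate state

    def forward(self, x, h):
        zr = torch.sigmoid(self.gates(torch.cat([x, h], dim=1)))
        z, r = zr.chunk(2, dim=1)
        h_tilde = torch.tanh(self.cand(torch.cat([x, r * h], dim=1)))
        return (1 - z) * h + z * h_tilde

# Scan a volume slice by slice; the hidden state carries inter-slice
# context that a lossless coder could use to predict the next slice.
cell = ConvGRUCell(ch=16)
volume = torch.randn(64, 16, 128, 128)  # depth, channels, H, W
h = torch.zeros(1, 16, 128, 128)
for d in range(volume.size(0)):
    h = cell(volume[d:d + 1], h)
```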
arXiv Detail & Related papers (2023-11-27T07:19:09Z)
- Towards Hardware-Specific Automatic Compression of Neural Networks [0.0]
Pruning and quantization are currently the major approaches to compressing neural networks.
Effective compression policies consider the influence of the specific hardware architecture on the used compression methods.
We propose an algorithmic framework called Galen that uses reinforcement learning to search for such policies, combining pruning and quantization.
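A minimal sketch of the action space such a search operates over, assuming per-layer magnitude pruning plus uniform quantization; the random policy below merely stands in for the RL agent, and the helper `compress_layer` is hypothetical.

```python
import torch
import torch.nn as nn

def compress_layer(weight, prune_ratio, bits):
    """Apply a (prune_ratio, bit-width) action to one layer's weights:
    magnitude pruning followed by uniform quantization."""
    flat = weight.abs().flatten()
    k = int(prune_ratio * flat.numel())
    threshold = flat.kthvalue(k).values if k > 0 else torch.tensor(-1.0)
    w = weight * (weight.abs() > threshold)
    scale = w.abs().max() / (2 ** (bits - 1) - 1) + 1e-12
    q = torch.round(w / scale).clamp(-(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
    return q * scale

# A policy (here: random, standing in for the RL agent) picks one action
# per layer; the reward would combine accuracy and measured latency on
# the target hardware.
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))
for layer in [m for m in model if isinstance(m, nn.Linear)]:
    action = (float(torch.rand(1)) * 0.5, int(torch.randint(4, 9, (1,))))
    layer.weight.data = compress_layer(layer.weight.data, *action)
```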
arXiv Detail & Related papers (2022-12-15T13:34:02Z)
- Crowd Counting on Heavily Compressed Images with Curriculum Pre-Training [90.76576712433595]
Applying lossy compression to images processed by deep neural networks can lead to significant accuracy degradation.
Inspired by the curriculum learning paradigm, we present a novel training approach called curriculum pre-training (CPT) for crowd counting on compressed images.
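One plausible reading of the curriculum, sketched below under assumed details: pre-train on lightly compressed images and progressively lower the JPEG quality. The schedule, toy data, and counting head are all illustrative; real crowd counting would regress density maps.

```python
import io
import torch
import torch.nn as nn
from PIL import Image
from torchvision import transforms

def jpeg_compress(img, quality):
    """Re-encode a PIL image at a given JPEG quality (lower = harsher)."""
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return Image.open(buf).convert("RGB")

# Stand-in data and counting head.
images = [Image.new("RGB", (64, 64), (i * 30 % 256, 80, 120)) for i in range(8)]
counts = torch.rand(8, 1)
model = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64 * 3, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
to_tensor = transforms.ToTensor()

# Curriculum: pre-train on lightly compressed images first, then on
# progressively harsher compression, so the model adapts gradually.
for quality in [90, 70, 50, 30, 10]:   # illustrative schedule
    for img, y in zip(images, counts):
        x = to_tensor(jpeg_compress(img, quality)).unsqueeze(0)
        loss = (model(x) - y).pow(2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
```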
arXiv Detail & Related papers (2022-08-15T08:43:21Z)
- COIN++: Data Agnostic Neural Compression [55.27113889737545]
COIN++ is a neural compression framework that seamlessly handles a wide range of data modalities.
We demonstrate the effectiveness of our method by compressing various data modalities.
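The underlying COIN recipe is to overfit a small coordinate network to a signal and store its (quantized) weights as the code; the sketch below shows that base idea, while COIN++ itself adds meta-learned modulations and quantization, which are omitted here. Proper SIREN initialization is also skipped for brevity.

```python
import torch
import torch.nn as nn

class Siren(nn.Module):
    """Small sinusoidal coordinate network; storing its weights
    *is* the compressed representation of the signal."""
    def __init__(self, hidden=32, w0=30.0):
        super().__init__()
        self.l1 = nn.Linear(2, hidden)
        self.l2 = nn.Linear(hidden, hidden)
        self.l3 = nn.Linear(hidden, 3)
        self.w0 = w0

    def forward(self, coords):
        h = torch.sin(self.w0 * self.l1(coords))
        h = torch.sin(self.w0 * self.l2(h))
        return self.l3(h)

# "Compress" one image by overfitting the network to (coordinate -> RGB).
# Other modalities only change what the coordinates index into.
H = W = 32
ys, xs = torch.meshgrid(torch.linspace(-1, 1, H), torch.linspace(-1, 1, W),
                        indexing="ij")
coords = torch.stack([xs, ys], dim=-1).reshape(-1, 2)
pixels = torch.rand(H * W, 3)   # stand-in image
net = Siren()
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for _ in range(500):
    opt.zero_grad()
    loss = ((net(coords) - pixels) ** 2).mean()
    loss.backward()
    opt.step()
n_params = sum(p.numel() for p in net.parameters())
print(f"stored parameters: {n_params} vs raw floats: {H * W * 3}")
```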
arXiv Detail & Related papers (2022-01-30T20:12:04Z)
- On Effects of Compression with Hyperdimensional Computing in Distributed Randomized Neural Networks [6.25118865553438]
We propose a model for distributed classification based on randomized neural networks and hyperdimensional computing, together with a more flexible approach to compression, which we compare to conventional compression algorithms, dimensionality reduction, and quantization techniques.
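A generic hyperdimensional-computing sketch of the kind of compression involved, assuming key-value superposition (not necessarily the paper's exact scheme): several vectors are bound with random bipolar keys and superimposed into a single hypervector, and unbinding recovers each one up to zero-mean crosstalk.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 10_000, 10   # dimensionality; vectors packed per bundle

values = rng.normal(size=(n, d))             # activations to transmit
keys = rng.choice([-1.0, 1.0], size=(n, d))  # random bipolar key per vector

# Compress n*d numbers into d: bind each vector with its key (elementwise
# product) and superimpose the results into a single hypervector.
bundle = (keys * values).sum(axis=0)

# Unbind with key 0: value_0 plus zero-mean crosstalk from the others.
recovered = keys[0] * bundle
cos = lambda a, b: a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
print(cos(recovered, values[0]))   # ~1/sqrt(n): clearly above chance
print(cos(recovered, values[1]))   # ~0: other vectors look like noise
```

The recovery is noisy by design; the compression is lossy, and downstream readouts must tolerate the crosstalk, which is exactly the trade-off such a study measures.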
arXiv Detail & Related papers (2021-06-17T22:02:40Z)
- Analyzing and Mitigating JPEG Compression Defects in Deep Learning [69.04777875711646]
We present a unified study of the effects of JPEG compression on a range of common tasks and datasets.
We show that there is a significant penalty on common performance metrics for high compression.
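Such a study can be approximated with a simple sweep; the sketch below re-encodes an image (the file name `sample.jpg` is a placeholder) at decreasing JPEG quality and watches a pretrained classifier's prediction drift.

```python
import io
import torch
from PIL import Image
from torchvision import transforms
from torchvision.models import resnet18

def recompress(img, quality):
    """Round-trip an image through JPEG at the given quality."""
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return Image.open(buf).convert("RGB")

model = resnet18(weights="IMAGENET1K_V1").eval()   # downloads weights
prep = transforms.Compose([
    transforms.Resize(256), transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

img = Image.open("sample.jpg").convert("RGB")   # placeholder test image
with torch.no_grad():
    for q in [95, 75, 50, 25, 10, 5]:
        logits = model(prep(recompress(img, q)).unsqueeze(0))
        print(q, logits.argmax().item())        # prediction vs quality
```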
arXiv Detail & Related papers (2020-11-17T20:32:57Z)
- Permute, Quantize, and Fine-tune: Efficient Compression of Neural Networks [70.0243910593064]
Key to the success of vector quantization is deciding which parameter groups should be compressed together.
In this paper we make the observation that the weights of two adjacent layers can be permuted while expressing the same function.
We then establish a connection to rate-distortion theory and search for permutations that result in networks that are easier to compress.
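The permutation invariance is easy to verify directly; the sketch below permutes one layer's output units and the next layer's input units with the same permutation and checks that the composed function is unchanged.

```python
import torch
import torch.nn as nn

# Two adjacent layers compute f(x) = W2 @ relu(W1 @ x + b1) + b2.
torch.manual_seed(0)
l1, l2 = nn.Linear(16, 32), nn.Linear(32, 8)
x = torch.randn(4, 16)
ref = l2(torch.relu(l1(x)))

# Permute l1's output units and l2's input units with the same
# permutation: the function is unchanged, but the weight groupings
# (and hence how well they vector-quantize) can differ dramatically.
perm = torch.randperm(32)
l1.weight.data = l1.weight.data[perm]
l1.bias.data = l1.bias.data[perm]
l2.weight.data = l2.weight.data[:, perm]

out = l2(torch.relu(l1(x)))
print(torch.allclose(ref, out, atol=1e-6))   # True: same function
```

The paper's search exploits exactly this freedom: among all function-preserving permutations, it looks for those whose weight groupings quantize with less error.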
arXiv Detail & Related papers (2020-10-29T15:47:26Z)
- PowerGossip: Practical Low-Rank Communication Compression in Decentralized Deep Learning [62.440827696638664]
We introduce a simple algorithm that directly compresses the model differences between neighboring workers.
Inspired by PowerSGD for centralized deep learning, this algorithm uses power iteration steps to maximize the information transferred per bit.
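A sketch of a single power step in the PowerSGD family, on which PowerGossip builds; details such as reusing the previous iteration's basis across steps and exactly which matrices neighbors exchange are simplified away.

```python
import torch

def power_compress(M, rank=2):
    """One power-iteration step gives a rank-`rank` approximation of M,
    transmitted as two thin matrices instead of the full m x n block."""
    m, n = M.shape
    P = torch.randn(n, rank)        # shared random init (seeded in practice)
    Q, _ = torch.linalg.qr(M @ P)   # m x rank, orthonormal columns
    R = M.t() @ Q                   # n x rank
    return Q, R                     # M_hat = Q @ R.t()

# Compress the parameter difference between two neighboring workers;
# such differences are often close to low-rank.
diff = torch.randn(256, 4) @ torch.randn(4, 128)
Q, R = power_compress(diff, rank=4)
ratio = diff.numel() / (Q.numel() + R.numel())
err = (diff - Q @ R.t()).norm() / diff.norm()
print(f"compression {ratio:.1f}x, relative error {err:.6f}")
```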
arXiv Detail & Related papers (2020-08-04T09:14:52Z)
- Distributed Learning and Inference with Compressed Images [40.07509530656681]
This paper focuses on vision-based perception for autonomous driving as a paradigmatic scenario.
We propose dataset restoration, based on image restoration with generative adversarial networks (GANs).
Our method is agnostic to both the particular image compression method and the downstream task.
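A heavily simplified sketch of the pipeline idea: degrade training images the way deployment would, then restore them before training the downstream model. The fresh conv net standing in for the GAN-based restorer is purely a placeholder; in practice it would be a pretrained generator.

```python
import io
import torch
import torch.nn as nn
from PIL import Image
from torchvision import transforms

def jpeg_roundtrip(img, quality=10):
    """Simulate the degraded images the deployed system actually sees."""
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return Image.open(buf).convert("RGB")

# Placeholder restorer standing in for the paper's GAN-based restoration.
restorer = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(16, 3, 3, padding=1))

to_tensor = transforms.ToTensor()
img = Image.new("RGB", (64, 64), (120, 90, 60))  # stand-in training image
degraded = to_tensor(jpeg_roundtrip(img)).unsqueeze(0)
with torch.no_grad():
    restored = restorer(degraded)   # train the downstream task on this
```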
arXiv Detail & Related papers (2020-04-22T11:20:53Z)