Progressive Neural Compression for Adaptive Image Offloading under
Timing Constraints
- URL: http://arxiv.org/abs/2310.05306v1
- Date: Sun, 8 Oct 2023 22:58:31 GMT
- Title: Progressive Neural Compression for Adaptive Image Offloading under
Timing Constraints
- Authors: Ruiqi Wang, Hanyang Liu, Jiaming Qiu, Moran Xu, Roch Guerin, Chenyang
Lu
- Abstract summary: It is important to develop an adaptive approach that maximizes the inference performance of machine learning applications under timing constraints.
In this paper, we use image classification as our target application and propose progressive neural compression (PNC) as an efficient solution to this problem.
We demonstrate the benefits of PNC over state-of-the-art neural compression approaches and traditional compression methods on a testbed.
- Score: 9.903309560890317
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: IoT devices are increasingly the source of data for machine learning (ML)
applications running on edge servers. Data transmissions from devices to
servers are often over local wireless networks whose bandwidth is not just
limited but, more importantly, variable. Furthermore, in cyber-physical systems
interacting with the physical environment, image offloading is also commonly
subject to timing constraints. It is, therefore, important to develop an
adaptive approach that maximizes the inference performance of ML applications
under timing constraints and the resource constraints of IoT devices. In this
paper, we use image classification as our target application and propose
progressive neural compression (PNC) as an efficient solution to this problem.
Although neural compression has been used to compress images for different ML
applications, existing solutions often produce fixed-size outputs that are
unsuitable for timing-constrained offloading over variable bandwidth. To
address this limitation, we train a multi-objective rateless autoencoder that
optimizes for multiple compression rates via stochastic taildrop to create a
compression solution that produces features ordered according to their
importance to inference performance. Features are then transmitted in that
order based on available bandwidth, with classification ultimately performed
using the (sub)set of features received by the deadline. We demonstrate the
benefits of PNC over state-of-the-art neural compression approaches and
traditional compression methods on a testbed comprising an IoT device and an
edge server connected over a wireless network with varying bandwidth.
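The two mechanisms the abstract describes, stochastic taildrop during training and deadline-driven transmission of importance-ordered features, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the uniform choice of how many features to keep, the fixed per-feature size in bytes, and the zero-filling of features that miss the deadline are all assumptions made for illustration.

```python
import random


def apply_taildrop(features, rng):
    """Training-time sketch: keep a random-length prefix and zero out the tail.

    Assumption: the number of kept features is drawn uniformly from
    1..len(features); the abstract does not specify the distribution.
    """
    keep = rng.randint(1, len(features))  # randint is inclusive on both ends
    return features[:keep] + [0.0] * (len(features) - keep)


def features_within_deadline(num_features, bytes_per_feature,
                             bandwidth_bytes_per_s, deadline_s):
    """Count how many leading features can be sent before the deadline.

    Assumption: every feature has the same size and bandwidth is constant
    over the deadline window.
    """
    budget_bytes = int(bandwidth_bytes_per_s * deadline_s)
    return max(0, min(num_features, budget_bytes // bytes_per_feature))


def receive_prefix(features, received):
    """Server-side sketch: features that missed the deadline are zero-filled
    (an assumption) so the classifier always sees a fixed-length input."""
    return features[:received] + [0.0] * (len(features) - received)


if __name__ == "__main__":
    rng = random.Random(0)
    encoded = [0.5, -1.0, 2.0, 0.25, 3.0, -0.75, 1.5, 0.1]  # hypothetical encoder output

    # Training: each step sees a different truncation of the same feature vector.
    print(apply_taildrop(encoded, rng))

    # Inference: send as many leading features as the bandwidth allows.
    sent = features_within_deadline(len(encoded), bytes_per_feature=100,
                                    bandwidth_bytes_per_s=2500, deadline_s=0.2)
    print(sent, receive_prefix(encoded, sent))
```

Because training repeatedly truncates the tail, the leading features are pushed to carry the information that matters most for classification, which is what makes transmitting only a prefix meaningful at inference time.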
Related papers
- UniCompress: Enhancing Multi-Data Medical Image Compression with Knowledge Distillation [59.3877309501938]
Implicit Neural Representation (INR) networks have shown remarkable versatility due to their flexible compression ratios.
We introduce a codebook containing frequency domain information as a prior input to the INR network.
This enhances the representational power of INR and provides distinctive conditioning for different image blocks.
arXiv Detail & Related papers (2024-05-27T05:52:13Z)
- AdaBM: On-the-Fly Adaptive Bit Mapping for Image Super-Resolution [53.23803932357899]
We introduce the first on-the-fly adaptive quantization framework that accelerates the processing time from hours to seconds.
We achieve performance competitive with previous adaptive quantization methods, while accelerating the processing time by 2000x.
arXiv Detail & Related papers (2024-04-04T08:37:27Z)
- Federated learning compression designed for lightweight communications [0.0]
Federated Learning (FL) is a promising distributed machine learning method for edge-level machine learning.
In this paper, we investigate the impact of compression techniques on FL for a typical image classification task.
arXiv Detail & Related papers (2023-10-23T08:36:21Z)
- Bandwidth-efficient Inference for Neural Image Compression [26.87198174202502]
We propose an end-to-end differentiable, bandwidth-efficient neural inference method in which activations are compressed by a neural data compression method.
Optimized with existing model quantization methods, the low-level task of image compression can achieve up to 19x bandwidth reduction with 6.21x energy saving.
arXiv Detail & Related papers (2023-09-06T09:31:37Z)
- FrankenSplit: Efficient Neural Feature Compression with Shallow Variational Bottleneck Injection for Mobile Edge Computing [5.815300670677979]
We introduce a novel framework for resource-conscious compression models and extensively evaluate our method in an asymmetric environment.
Our method achieves 60% lower bitrate than a state-of-the-art SC method without decreasing accuracy and is up to 16x faster than offloading with existing standards.
arXiv Detail & Related papers (2023-02-21T14:03:22Z)
- Analysis of the Effect of Low-Overhead Lossy Image Compression on the Performance of Visual Crowd Counting for Smart City Applications [78.55896581882595]
Lossy image compression techniques can reduce the quality of the images, leading to accuracy degradation.
In this paper, we analyze the effect of applying low-overhead lossy image compression methods on the accuracy of visual crowd counting.
arXiv Detail & Related papers (2022-07-20T19:20:03Z)
- An Adaptive Device-Edge Co-Inference Framework Based on Soft Actor-Critic [72.35307086274912]
High-dimensional parameter models and large-scale mathematical calculations restrict execution efficiency, especially for Internet of Things (IoT) devices.
We propose a new Deep Reinforcement Learning (DRL)-Soft Actor Critic for discrete (SAC-d), which generates the exit point and compressing bits by soft policy iterations.
Based on the latency- and accuracy-aware reward design, such a computation can adapt well to complex environments such as dynamic wireless channels and arbitrary processing, and is capable of supporting the 5G URL
arXiv Detail & Related papers (2022-01-09T09:31:50Z)
- Implicit Neural Representations for Image Compression [103.78615661013623]
Implicit Neural Representations (INRs) have gained attention as a novel and effective representation for various data types.
We propose the first comprehensive compression pipeline based on INRs including quantization, quantization-aware retraining and entropy coding.
We find that our approach to source compression with INRs vastly outperforms similar prior work.
arXiv Detail & Related papers (2021-12-08T13:02:53Z)
- COMET: A Novel Memory-Efficient Deep Learning Training Framework by Using Error-Bounded Lossy Compression [8.080129426746288]
Training wide and deep neural networks (DNNs) requires large amounts of storage resources such as memory.
We propose a memory-efficient CNN training framework (called COMET) that leverages error-bounded lossy compression.
Our framework can significantly reduce the training memory consumption by up to 13.5X over the baseline training and 1.8X over another state-of-the-art compression-based framework.
arXiv Detail & Related papers (2021-11-18T07:43:45Z)
- ALF: Autoencoder-based Low-rank Filter-sharing for Efficient Convolutional Neural Networks [63.91384986073851]
We propose the autoencoder-based low-rank filter-sharing technique (ALF).
ALF shows a reduction of 70% in network parameters, 61% in operations and 41% in execution time, with minimal loss in accuracy.
arXiv Detail & Related papers (2020-07-27T09:01:22Z)
- Dynamic Compression Ratio Selection for Edge Inference Systems with Hard Deadlines [9.585931043664363]
We propose a dynamic compression ratio selection scheme for edge inference systems with hard deadlines.
Information augmentation, which retransmits less-compressed data for tasks whose inference was erroneous, is proposed to improve accuracy.
Considering the wireless transmission errors, we further design a retransmission scheme to reduce performance degradation due to packet losses.
arXiv Detail & Related papers (2020-05-25T17:11:53Z)