Related papers: Progressive Neural Compression for Adaptive Image Offloading under Timing Constraints

Progressive Neural Compression for Adaptive Image Offloading under Timing Constraints

URL: http://arxiv.org/abs/2310.05306v1
Date: Sun, 8 Oct 2023 22:58:31 GMT
Title: Progressive Neural Compression for Adaptive Image Offloading under Timing Constraints
Authors: Ruiqi Wang, Hanyang Liu, Jiaming Qiu, Moran Xu, Roch Guerin, Chenyang Lu
Abstract summary: It is important to develop an adaptive approach that maximizes the inference performance of machine learning applications under timing constraints. In this paper, we use image classification as our target application and propose progressive neural compression (PNC) as an efficient solution to this problem. We demonstrate the benefits of PNC over state-of-the-art neural compression approaches and traditional compression methods on a testbed.
Score: 9.903309560890317
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: IoT devices are increasingly the source of data for machine learning (ML) applications running on edge servers. Data transmissions from devices to servers are often over local wireless networks whose bandwidth is not just limited but, more importantly, variable. Furthermore, in cyber-physical systems interacting with the physical environment, image offloading is also commonly subject to timing constraints. It is, therefore, important to develop an adaptive approach that maximizes the inference performance of ML applications under timing constraints and the resource constraints of IoT devices. In this paper, we use image classification as our target application and propose progressive neural compression (PNC) as an efficient solution to this problem. Although neural compression has been used to compress images for different ML applications, existing solutions often produce fixed-size outputs that are unsuitable for timing-constrained offloading over variable bandwidth. To address this limitation, we train a multi-objective rateless autoencoder that optimizes for multiple compression rates via stochastic taildrop to create a compression solution that produces features ordered according to their importance to inference performance. Features are then transmitted in that order based on available bandwidth, with classification ultimately performed using the (sub)set of features received by the deadline. We demonstrate the benefits of PNC over state-of-the-art neural compression approaches and traditional compression methods on a testbed comprising an IoT device and an edge server connected over a wireless network with varying bandwidth.

Related papers

Multi-Scale Invertible Neural Network for Wide-Range Variable-Rate Learned Image Compression [90.59962443790593]
In this paper, we present a variable-rate image compression model based on invertible transform to overcome limitations. Specifically, we design a lightweight multi-scale invertible neural network, which maps the input image into multi-scale latent representations. Experimental results demonstrate that the proposed method achieves state-of-the-art performance compared to existing variable-rate methods.
arXiv Detail & Related papers (2025-03-27T09:08:39Z)
Task-Oriented Feature Compression for Multimodal Understanding via Device-Edge Co-Inference [49.77734021302196]
We propose a task-oriented feature compression (TOFC) method for multimodal understanding in a device-edge co-inference framework. To enhance compression efficiency, multiple entropy models are adaptively selected based on the characteristics of the visual features. Results show that TOFC achieves up to 60% reduction in data transmission overhead and 50% reduction in system latency.
arXiv Detail & Related papers (2025-03-17T08:37:22Z)
Continuous Patch Stitching for Block-wise Image Compression [56.97857167461269]
We propose a novel continuous patch stitching (CPS) framework for block-wise image compression. Our CPS framework achieves the state-of-the-art performance against existing baselines, whilst requiring less than half of computing resources of existing models.
arXiv Detail & Related papers (2025-02-24T03:11:59Z)
UniCompress: Enhancing Multi-Data Medical Image Compression with Knowledge Distillation [59.3877309501938]
Implicit Neural Representation (INR) networks have shown remarkable versatility due to their flexible compression ratios. We introduce a codebook containing frequency domain information as a prior input to the INR network. This enhances the representational power of INR and provides distinctive conditioning for different image blocks.
arXiv Detail & Related papers (2024-05-27T05:52:13Z)
AdaBM: On-the-Fly Adaptive Bit Mapping for Image Super-Resolution [53.23803932357899]
We introduce the first on-the-fly adaptive quantization framework that accelerates the processing time from hours to seconds. We achieve competitive performance with the previous adaptive quantization methods, while the processing time is accelerated by x2000.
arXiv Detail & Related papers (2024-04-04T08:37:27Z)
Federated learning compression designed for lightweight communications [0.0]
Federated Learning (FL) is a promising distributed machine learning method for edge-level machine learning. In this paper, we investigate the impact of compression techniques on FL for a typical image classification task.
arXiv Detail & Related papers (2023-10-23T08:36:21Z)
Bandwidth-efficient Inference for Neural Image Compression [26.87198174202502]
We propose an end-to-end differentiable bandwidth efficient neural inference method with the activation compressed by neural data compression method. Optimized with existing model quantization methods, low-level task of image compression can achieve up to 19x bandwidth reduction with 6.21x energy saving.
arXiv Detail & Related papers (2023-09-06T09:31:37Z)
FrankenSplit: Efficient Neural Feature Compression with Shallow Variational Bottleneck Injection for Mobile Edge Computing [5.815300670677979]
We introduce a novel framework for resource-conscious compression models and extensively evaluate our method in an asymmetric environment. Our method achieves 60% lower than a state-of-the-art SC method without decreasing accuracy and is up 16x faster than offloading with existing standards.
arXiv Detail & Related papers (2023-02-21T14:03:22Z)
Analysis of the Effect of Low-Overhead Lossy Image Compression on the Performance of Visual Crowd Counting for Smart City Applications [78.55896581882595]
Lossy image compression techniques can reduce the quality of the images, leading to accuracy degradation. In this paper, we analyze the effect of applying low-overhead lossy image compression methods on the accuracy of visual crowd counting.
arXiv Detail & Related papers (2022-07-20T19:20:03Z)
An Adaptive Device-Edge Co-Inference Framework Based on Soft Actor-Critic [72.35307086274912]
High-dimension parameter model and large-scale mathematical calculation restrict execution efficiency, especially for Internet of Things (IoT) devices. We propose a new Deep Reinforcement Learning (DRL)-Soft Actor Critic for discrete (SAC-d), which generates the emphexit point, emphexit point, and emphcompressing bits by soft policy iterations. Based on the latency and accuracy aware reward design, such an computation can well adapt to the complex environment like dynamic wireless channel and arbitrary processing, and is capable of supporting the 5G URL
arXiv Detail & Related papers (2022-01-09T09:31:50Z)
Implicit Neural Representations for Image Compression [103.78615661013623]
Implicit Neural Representations (INRs) have gained attention as a novel and effective representation for various data types. We propose the first comprehensive compression pipeline based on INRs including quantization, quantization-aware retraining and entropy coding. We find that our approach to source compression with INRs vastly outperforms similar prior work.
arXiv Detail & Related papers (2021-12-08T13:02:53Z)
COMET: A Novel Memory-Efficient Deep Learning Training Framework by Using Error-Bounded Lossy Compression [8.080129426746288]
Training wide and deep neural networks (DNNs) require large amounts of storage resources such as memory. We propose a memory-efficient CNN training framework (called COMET) that leverages error-bounded lossy compression. Our framework can significantly reduce the training memory consumption by up to 13.5X over the baseline training and 1.8X over another state-of-the-art compression-based framework.
arXiv Detail & Related papers (2021-11-18T07:43:45Z)
ALF: Autoencoder-based Low-rank Filter-sharing for Efficient Convolutional Neural Networks [63.91384986073851]
We propose the autoencoder-based low-rank filter-sharing technique technique (ALF) ALF shows a reduction of 70% in network parameters, 61% in operations and 41% in execution time, with minimal loss in accuracy.
arXiv Detail & Related papers (2020-07-27T09:01:22Z)
Dynamic Compression Ratio Selection for Edge Inference Systems with Hard Deadlines [9.585931043664363]
We propose a dynamic compression ratio selection scheme for edge inference system with hard deadlines. Information augmentation that retransmits less compressed data of task with erroneous inference is proposed to enhance the accuracy performance. Considering the wireless transmission errors, we further design a retransmission scheme to reduce performance degradation due to packet losses.
arXiv Detail & Related papers (2020-05-25T17:11:53Z)

This list is automatically generated from the titles and abstracts of the papers in this site.