Related papers: Pruned Lightweight Encoders for Computer Vision

URL: http://arxiv.org/abs/2211.13137v1
Date: Wed, 23 Nov 2022 17:11:48 GMT
Title: Pruned Lightweight Encoders for Computer Vision
Authors: Jakub Žádník, Markku Mäkitalo, Pekka Jääskeläinen
Abstract summary: We show that ASTC and JPEG XS encoding configurations can be used on a near-sensor edge device to ensure low latency. We reduced the classification accuracy and segmentation mean intersection over union (mIoU) degradation due to ASTC compression to 4.9-5.0 percentage points (pp) and 4.4-4.0 pp, respectively. In terms of encoding speed, our ASTC encoder implementation is 2.3x faster than JPEG.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Latency-critical computer vision systems, such as autonomous driving or drone control, require fast image or video compression when offloading neural network inference to a remote computer. To ensure low latency on a near-sensor edge device, we propose the use of lightweight encoders with constant bitrate and pruned encoding configurations, namely, ASTC and JPEG XS. Pruning introduces significant distortion which we show can be recovered by retraining the neural network with compressed data after decompression. Such an approach does not modify the network architecture or require coding format modifications. By retraining with compressed datasets, we reduced the classification accuracy and segmentation mean intersection over union (mIoU) degradation due to ASTC compression to 4.9-5.0 percentage points (pp) and 4.4-4.0 pp, respectively. With the same method, the mIoU lost due to JPEG XS compression at the main profile was restored to 2.7-2.3 pp. In terms of encoding speed, our ASTC encoder implementation is 2.3x faster than JPEG. Even though the JPEG XS reference encoder requires optimizations to reach low latency, we showed that disabling significance flag coding saves 22-23% of encoding time at the cost of 0.4-0.3 mIoU after retraining.

Related papers

SIEDD: Shared-Implicit Encoder with Discrete Decoders [36.705337163276255]
Implicit Neural Representations (INRs) offer exceptional fidelity for video compression by learning per-video optimized functions. Existing attempts to accelerate INR encoding often sacrifice reconstruction quality or crucial coordinate-level control. We introduce SIEDD, a novel architecture that fundamentally accelerates INR encoding without these compromises.
arXiv Detail & Related papers (2025-06-29T19:39:43Z)
Generative Latent Coding for Ultra-Low Bitrate Image and Video Compression [61.500904231491596]
Most approaches for image and video compression perform transform coding in the pixel space to reduce redundancy. We propose Generative Latent Coding (GLC) models for image and video compression, GLC-image and GLC-Video.
arXiv Detail & Related papers (2025-05-22T03:31:33Z)
Embedding Compression Distortion in Video Coding for Machines [67.97469042910855]
Currently, video transmission serves not only the Human Visual System (HVS) for viewing but also machine perception for analysis. We propose a Compression Distortion Representation Embedding (CDRE) framework, which extracts machine-perception-related distortion representation and embeds it into downstream models. Our framework can effectively boost the rate-task performance of existing codecs with minimal overhead in terms of execution time and number of parameters.
arXiv Detail & Related papers (2025-03-27T13:01:53Z)
JDEC: JPEG Decoding via Enhanced Continuous Cosine Coefficients [17.437568540883106]
We propose a practical approach to JPEG image decoding, utilizing a local implicit neural representation with continuous cosine formulation. Our proposed network achieves state-of-the-art performance in flexible color image JPEG artifact removal tasks.
arXiv Detail & Related papers (2024-04-03T03:28:04Z)
Deep Lossy Plus Residual Coding for Lossless and Near-lossless Image Compression [85.93207826513192]
We propose a unified and powerful deep lossy plus residual (DLPR) coding framework for both lossless and near-lossless image compression. We solve the joint lossy and residual compression problem in the approach of VAEs. In the near-lossless mode, we quantize the original residuals to satisfy a given $\ell_\infty$ error bound.
arXiv Detail & Related papers (2022-09-11T12:11:56Z)
Reducing Redundancy in the Bottleneck Representation of the Autoencoders [98.78384185493624]
Autoencoders are a type of unsupervised neural network that can be used to solve various tasks. We propose a scheme to explicitly penalize feature redundancies in the bottleneck representation. We tested our approach across different tasks: dimensionality reduction using three different datasets, image compression using the MNIST dataset, and image denoising using Fashion-MNIST.
arXiv Detail & Related papers (2022-02-09T18:48:02Z)
Neural JPEG: End-to-End Image Compression Leveraging a Standard JPEG Encoder-Decoder [73.48927855855219]
We propose a system that learns to improve the encoding performance by enhancing its internal neural representations on both the encoder and decoder ends. Experiments demonstrate that our approach successfully improves the rate-distortion performance over JPEG across various quality metrics.
arXiv Detail & Related papers (2022-01-27T20:20:03Z)
Deep Video Coding with Dual-Path Generative Adversarial Network [39.19042551896408]
This paper proposes an efficient codec, namely the dual-path generative adversarial network-based video codec (DGVC). Our DGVC reduces the average bit-per-pixel (bpp) by 39.39%/54.92% at the same PSNR/MS-SSIM.
arXiv Detail & Related papers (2021-11-29T11:39:28Z)
Image Compression with Encoder-Decoder Matched Semantic Segmentation [15.536056887418676]
Layered image compression is a promising direction. Some works transmit the semantic segment together with the compressed image data. We propose a new layered image compression framework with encoder-decoder matched semantic segmentation (EDMS). The proposed EDMS framework can get up to 35.31% BD-rate reduction over the HEVC-based (BPG) codec, as well as encoding time saving.
arXiv Detail & Related papers (2021-01-24T04:11:05Z)
Learning to Improve Image Compression without Changing the Standard Decoder [100.32492297717056]
We propose learning to improve the encoding performance with the standard decoder. Specifically, a frequency-domain pre-editing method is proposed to optimize the distribution of DCT coefficients. We do not modify the JPEG decoder and therefore our approach is applicable when viewing images with the widely used standard JPEG decoder.
arXiv Detail & Related papers (2020-09-27T19:24:42Z)
Content Adaptive and Error Propagation Aware Deep Video Compression [110.31693187153084]
We propose a content adaptive and error propagation aware video compression system. Our method employs a joint training strategy by considering the compression performance of multiple consecutive frames instead of a single frame. Instead of using the hand-crafted coding modes in the traditional compression systems, we design an online encoder updating scheme in our system.
arXiv Detail & Related papers (2020-03-25T09:04:24Z)
Learning Better Lossless Compression Using Lossy Compression [100.50156325096611]
We leverage the powerful lossy image compression algorithm BPG to build a lossless image compression system. We model the distribution of the residual with a convolutional neural network-based probabilistic model that is conditioned on the BPG reconstruction. Finally, the image is stored using the concatenation of the bitstreams produced by BPG and the learned residual coder.
arXiv Detail & Related papers (2020-03-23T11:21:52Z)

This list is automatically generated from the titles and abstracts of the papers on this site.