Progressive Learning with Visual Prompt Tuning for Variable-Rate Image
Compression
- URL: http://arxiv.org/abs/2311.13846v2
- Date: Tue, 28 Nov 2023 14:31:43 GMT
- Title: Progressive Learning with Visual Prompt Tuning for Variable-Rate Image
Compression
- Authors: Shiyu Qin, Yimin Zhou, Jinpeng Wang, Bin Chen, Baoyi An, Tao Dai,
Shu-Tao Xia
- Abstract summary: We propose a progressive learning paradigm for transformer-based variable-rate image compression.
Inspired by visual prompt tuning, we use LPM to extract prompts for input images and hidden features at the encoder side and decoder side, respectively.
Our model outperforms all current variable-rate image compression methods in terms of rate-distortion performance and approaches the state-of-the-art fixed-rate image compression methods trained from scratch.
- Score: 60.689646881479064
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: In this paper, we propose a progressive learning paradigm for
transformer-based variable-rate image compression. Our approach covers a wide
range of compression rates with the assistance of the Layer-adaptive Prompt
Module (LPM). Inspired by visual prompt tuning, we use LPM to extract prompts
for input images and hidden features at the encoder side and decoder side,
respectively, which are fed as additional information into the Swin Transformer
layer of a pre-trained transformer-based image compression model to affect the
allocation of attention regions and bits, which in turn changes the target
compression ratio of the model. To keep the network lightweight, we
integrate prompt networks with fewer convolutional layers. Extensive
experiments show that, compared to methods based on multiple models that
are optimized separately for different target rates, the proposed method
achieves the same performance with 80% savings in parameter storage and 90%
savings in datasets. Meanwhile, our model outperforms all current
variable-rate image compression methods in terms of rate-distortion
performance and approaches the state-of-the-art fixed-rate image
compression methods trained from scratch.
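
As a rough illustration of the prompt-tuning mechanism described in the abstract, the sketch below prepends learnable prompt tokens, produced by a small prompt network, to the token sequence of a frozen transformer layer. This is a minimal PyTorch sketch under assumed names and shapes (PromptedTransformerLayer, the prompt length, and the use of a generic encoder layer in place of a windowed Swin block are all assumptions), not the paper's actual Layer-adaptive Prompt Module.

```python
# Minimal sketch of prompt injection into a frozen transformer layer
# (assumed names and shapes; not the paper's exact Layer-adaptive Prompt
# Module). A small prompt network derives prompt tokens that are prepended
# to the layer's input sequence, and only these prompt parameters train.
import torch
import torch.nn as nn


class PromptedTransformerLayer(nn.Module):
    def __init__(self, frozen_layer: nn.Module, embed_dim: int, num_prompts: int = 4):
        super().__init__()
        self.layer = frozen_layer
        for p in self.layer.parameters():
            p.requires_grad = False  # keep the pre-trained backbone fixed
        # Lightweight prompt generator (stands in for the paper's small
        # convolutional prompt networks, which is an assumption here).
        self.prompt_net = nn.Sequential(
            nn.Linear(embed_dim, embed_dim // 2),
            nn.GELU(),
            nn.Linear(embed_dim // 2, num_prompts * embed_dim),
        )
        self.num_prompts = num_prompts
        self.embed_dim = embed_dim

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, seq_len, embed_dim) hidden features at this layer.
        b = tokens.size(0)
        ctx = tokens.mean(dim=1)  # simple per-image context vector
        prompts = self.prompt_net(ctx).view(b, self.num_prompts, self.embed_dim)
        out = self.layer(torch.cat([prompts, tokens], dim=1))  # prepend prompts
        return out[:, self.num_prompts:, :]  # drop the prompt tokens again


# Example with a generic encoder layer standing in for a Swin block:
base = nn.TransformerEncoderLayer(d_model=96, nhead=4, batch_first=True)
prompted = PromptedTransformerLayer(base, embed_dim=96, num_prompts=4)
y = prompted(torch.randn(2, 64, 96))  # -> shape (2, 64, 96)
```

Training only the prompt-side parameters while the backbone stays frozen is what makes this style of adaptation lightweight compared to maintaining a separately optimized model per target rate.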
Related papers
- Bi-Level Spatial and Channel-aware Transformer for Learned Image Compression [0.0]
We propose a novel Transformer-based image compression method that enhances the transformation stage by considering frequency components within the feature map.
Our method integrates a novel Hybrid Spatial-Channel Attention Transformer Block (HSCATB), where a spatial-based branch independently handles high and low frequencies.
We also introduce a Mixed Local-Global Feed Forward Network (MLGFFN) within the Transformer block to enhance the extraction of diverse and rich information.
arXiv Detail & Related papers (2024-08-07T15:35:25Z)
- Transferable Learned Image Compression-Resistant Adversarial Perturbations [66.46470251521947]
Adversarial attacks can readily disrupt the image classification system, revealing the vulnerability of DNN-based recognition tasks.
We introduce a new pipeline that targets image classification models that utilize learned image compressors as pre-processing modules.
arXiv Detail & Related papers (2024-01-06T03:03:28Z)
- Dynamic Low-Rank Instance Adaptation for Universal Neural Image Compression [33.92792778925365]
We propose a low-rank adaptation approach to address the rate-distortion drop observed in out-of-domain datasets.
Our proposed method exhibits universality across diverse image datasets.
arXiv Detail & Related papers (2023-08-15T12:17:46Z)
- Transformer-based Variable-rate Image Compression with Region-of-interest Control [24.794581811606445]
This paper proposes a transformer-based learned image compression system.
It is capable of achieving variable-rate compression with a single model while supporting the region-of-interest functionality.
arXiv Detail & Related papers (2023-05-18T08:40:34Z)
- High-Fidelity Variable-Rate Image Compression via Invertible Activation Transformation [24.379052026260034]
We propose the Invertible Activation Transformation (IAT) module to tackle the issue of high-fidelity fine variable-rate image compression.
IAT and QLevel together give the image compression model the ability of fine variable-rate control while better maintaining the image fidelity.
Our method outperforms the state-of-the-art variable-rate image compression method by a large margin, especially after multiple re-encodings.
arXiv Detail & Related papers (2022-09-12T07:14:07Z)
- Estimating the Resize Parameter in End-to-end Learned Image Compression [50.20567320015102]
We describe a search-free resizing framework that can further improve the rate-distortion tradeoff of recent learned image compression models.
Our results show that our new resizing parameter estimation framework can provide Bjontegaard-Delta rate (BD-rate) improvement of about 10% against leading perceptual quality engines.
arXiv Detail & Related papers (2022-04-26T01:35:02Z)
- Towards End-to-End Image Compression and Analysis with Transformers [99.50111380056043]
We propose an end-to-end image compression and analysis model with Transformers, targeting to the cloud-based image classification application.
We aim to redesign the Vision Transformer (ViT) model to perform image classification from the compressed features and facilitate image compression with the long-term information from the Transformer.
Experimental results demonstrate the effectiveness of the proposed model in both the image compression and the classification tasks.
arXiv Detail & Related papers (2021-12-17T03:28:14Z)
- Variable-Rate Deep Image Compression through Spatially-Adaptive Feature Transform [58.60004238261117]
We propose a versatile deep image compression network based on Spatial Feature Transform (SFT, arXiv:1804.02815).
Our model covers a wide range of compression rates using a single model, which is controlled by arbitrary pixel-wise quality maps.
The proposed framework allows us to perform task-aware image compressions for various tasks.
arXiv Detail & Related papers (2021-08-21T17:30:06Z)
- Modeling Lost Information in Lossy Image Compression [72.69327382643549]
Lossy image compression is one of the most commonly used operators for digital images.
We propose a novel invertible framework called Invertible Lossy Compression (ILC) to largely mitigate the information loss problem.
arXiv Detail & Related papers (2020-06-22T04:04:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.