SigVIC: Spatial Importance Guided Variable-Rate Image Compression
- URL: http://arxiv.org/abs/2303.09112v1
- Date: Thu, 16 Mar 2023 06:57:51 GMT
- Title: SigVIC: Spatial Importance Guided Variable-Rate Image Compression
- Authors: Jiaming Liang, Meiqin Liu, Chao Yao, Chunyu Lin, Yao Zhao
- Abstract summary: The variable-rate mechanism has improved the flexibility and efficiency of learning-based image compression.
One of the most common approaches to variable-rate coding is to scale internal features channel-wise or spatially uniformly.
We introduce Spatial Importance Guided Variable-Rate Image Compression (SigVIC), in which a spatial gating unit (SGU) is designed to adaptively learn a spatial importance mask.
- Score: 43.062173445454775
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The variable-rate mechanism has improved the flexibility and efficiency of
learning-based image compression, which otherwise requires training multiple models
for different rate-distortion tradeoffs. One of the most common approaches to
variable-rate coding is to scale internal features channel-wise or spatially
uniformly. However, the diversity of spatial importance is instructive for the bit
allocation of image compression. In this paper, we introduce Spatial Importance
Guided Variable-Rate Image Compression (SigVIC), in which a spatial gating unit
(SGU) is designed to adaptively learn a spatial importance mask. A spatial scaling
network (SSN) then takes the spatial importance mask to guide feature scaling and
bit allocation for variable-rate coding. Moreover, to improve the quality of the
decoded image, Top-K shallow features are selected to refine the decoded features
through a shallow feature fusion module (SFFM). Experiments show that our method
outperforms other learning-based methods (whether variable-rate or not) and
traditional codecs, with storage savings and high flexibility.
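The abstract describes the SGU/SSN pipeline only at a high level; the numpy toy below is a hypothetical sketch of the core idea, not the paper's implementation. All function names, shapes, and the sigmoid-projection form of the gating unit are illustrative assumptions: a learned spatial importance mask in [0, 1] scales each latent location, modulated by a scalar rate factor so a lower factor shrinks the latent energy (fewer bits after quantization).

```python
import numpy as np

rng = np.random.default_rng(0)

def spatial_gating_unit(features, w, b):
    """Hypothetical SGU sketch: collapse channels with a learned 1x1
    projection, then squash to an (H, W) importance mask in [0, 1]."""
    # features: (C, H, W); w: (C,); b: scalar
    logits = np.tensordot(w, features, axes=([0], [0])) + b  # (H, W)
    return 1.0 / (1.0 + np.exp(-logits))                     # sigmoid

def spatial_scaling(features, mask, rate_factor):
    """Hypothetical SSN step: scale each spatial location by its
    importance, modulated by a scalar rate factor (a higher factor
    keeps more signal, i.e. allocates more bits)."""
    return features * (mask * rate_factor)[None, :, :]

C, H, W = 8, 4, 4
feats = rng.standard_normal((C, H, W))
mask = spatial_gating_unit(feats, rng.standard_normal(C), 0.0)
scaled_lo = spatial_scaling(feats, mask, rate_factor=0.25)
scaled_hi = spatial_scaling(feats, mask, rate_factor=1.0)
```

Important regions (mask near 1) retain most of their magnitude; unimportant ones are attenuated, which is how the mask steers bit allocation once the scaled latent is quantized and entropy coded.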
Related papers
- SQ-GAN: Semantic Image Communications Using Masked Vector Quantization [55.02795214161371]
This work introduces Semantically Masked VQ-GAN (SQ-GAN), a novel approach to optimize image compression for semantic/task-oriented communications.
SQ-GAN employs off-the-shelf semantic segmentation and a new semantic-conditioned adaptive mask module (SAMM) to selectively encode semantically significant features of the images.
arXiv Detail & Related papers (2025-02-13T17:35:57Z)
- DeepFGS: Fine-Grained Scalable Coding for Learned Image Compression [27.834491128701963]
This paper proposes a learned fine-grained scalable image compression framework, namely DeepFGS.
For entropy coding, we design a mutual entropy model to fully explore the correlation between the basic and scalable features.
Experiments demonstrate that our proposed DeepFGS outperforms previous learning-based scalable image compression models.
arXiv Detail & Related papers (2024-11-30T11:19:38Z)
- Multi-scale Unified Network for Image Classification [33.560003528712414]
CNNs face notable challenges in performance and computational efficiency when dealing with real-world, multi-scale image inputs.
We propose a Multi-scale Unified Network (MUSN) consisting of multiple scales, a unified network, and a scale-invariant constraint.
MUSN yields an accuracy increase of up to 44.53% and reduces FLOPs by 7.01-16.13% in multi-scale scenarios.
arXiv Detail & Related papers (2024-03-27T06:40:26Z)
- Progressive Learning with Visual Prompt Tuning for Variable-Rate Image Compression [60.689646881479064]
We propose a progressive learning paradigm for transformer-based variable-rate image compression.
Inspired by visual prompt tuning, we use LPM to extract prompts for input images and hidden features at the encoder side and decoder side, respectively.
Our model outperforms all current variable-rate image compression methods in terms of rate-distortion performance and approaches the state-of-the-art fixed-rate image compression methods trained from scratch.
arXiv Detail & Related papers (2023-11-23T08:29:32Z)
- Multiscale Augmented Normalizing Flows for Image Compression [17.441496966834933]
We present a novel concept, which adapts the hierarchical latent space for augmented normalizing flows, an invertible latent variable model.
Our best-performing model achieved average rate savings of more than 7% over comparable single-scale models.
arXiv Detail & Related papers (2023-05-09T13:42:43Z)
- Variable-Rate Deep Image Compression through Spatially-Adaptive Feature Transform [58.60004238261117]
We propose a versatile deep image compression network based on Spatial Feature Transform (SFT, arXiv:1804.02815).
Our model covers a wide range of compression rates using a single model, which is controlled by arbitrary pixel-wise quality maps.
The proposed framework allows us to perform task-aware image compressions for various tasks.
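The pixel-wise quality-map control described above can be pictured with a small numpy sketch. The affine modulation below is a hypothetical stand-in for the paper's SFT layers (the scalar weights replace the small conv nets used in practice, and all names are illustrative assumptions): a quality map in [0, 1] is mapped to per-location scale and shift parameters that transform the features.

```python
import numpy as np

rng = np.random.default_rng(1)

def sft_modulate(features, quality_map, w_gamma, w_beta):
    """Hypothetical SFT-style modulation: a pixel-wise quality map is
    mapped to a per-location scale (gamma) and shift (beta) applied to
    the features. w_gamma / w_beta stand in for learned conv nets."""
    gamma = 1.0 + w_gamma * quality_map   # (H, W) per-pixel scale
    beta = w_beta * quality_map           # (H, W) per-pixel shift
    return features * gamma[None] + beta[None]

C, H, W = 4, 3, 3
x = rng.standard_normal((C, H, W))

q_low = np.zeros((H, W))                      # zero map: identity transform
q_roi = np.zeros((H, W)); q_roi[1, 1] = 1.0   # boost one region of interest

y_low = sft_modulate(x, q_low, w_gamma=0.5, w_beta=0.1)
y_roi = sft_modulate(x, q_roi, w_gamma=0.5, w_beta=0.1)
```

Because the quality map is an input rather than a trained constant, a single model can sweep the whole rate range at inference time, and task-aware maps (e.g. higher quality on detected objects) fall out for free.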
arXiv Detail & Related papers (2021-08-21T17:30:06Z)
- Learned Multi-Resolution Variable-Rate Image Compression with Octave-based Residual Blocks [15.308823742699039]
We propose a new variable-rate image compression framework, which employs generalized octave convolutions (GoConv) and generalized octave transposed-convolutions (GoTConv)
To enable a single model to operate with different bit rates and to learn multi-rate image features, a new objective function is introduced.
Experimental results show that the proposed framework trained with variable-rate objective function outperforms the standard codecs such as H.265/HEVC-based BPG and state-of-the-art learning-based variable-rate methods.
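The abstract does not spell out the variable-rate objective; a common formulation, shown here as a hedged numpy sketch (function name and the specific operating points are assumptions, not the paper's values), averages the rate-distortion Lagrangian R + λ·D over several target tradeoffs so that one set of weights serves every rate.

```python
import numpy as np

def variable_rate_objective(rates, distortions, lambdas):
    """Hypothetical multi-rate training objective: average the
    rate-distortion Lagrangian R + lambda * D over several target
    tradeoffs so a single model learns all of them."""
    rates, distortions, lambdas = map(np.asarray, (rates, distortions, lambdas))
    return float(np.mean(rates + lambdas * distortions))

# Example: three operating points sharing one set of weights.
loss = variable_rate_objective(
    rates=[0.2, 0.5, 1.1],          # bits per pixel at each point
    distortions=[40.0, 12.0, 3.0],  # e.g. MSE at each point
    lambdas=[0.01, 0.05, 0.25],
)
```

Minimizing this averaged loss forces the shared features to be useful at every λ, which is what lets one model replace a family of fixed-rate models.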
arXiv Detail & Related papers (2020-12-31T06:26:56Z)
- Modeling Lost Information in Lossy Image Compression [72.69327382643549]
Lossy image compression is one of the most commonly used operators for digital images.
We propose a novel invertible framework called Invertible Lossy Compression (ILC) to largely mitigate the information loss problem.
arXiv Detail & Related papers (2020-06-22T04:04:56Z)
- Learning Enriched Features for Real Image Restoration and Enhancement [166.17296369600774]
Convolutional neural networks (CNNs) have achieved dramatic improvements over conventional approaches for image restoration tasks.
We present a novel architecture with the collective goals of maintaining spatially-precise high-resolution representations through the entire network.
Our approach learns an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
arXiv Detail & Related papers (2020-03-15T11:04:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.