Thousand to One: Semantic Prior Modeling for Conceptual Coding
- URL: http://arxiv.org/abs/2103.07131v2
- Date: Tue, 16 Mar 2021 02:57:07 GMT
- Title: Thousand to One: Semantic Prior Modeling for Conceptual Coding
- Authors: Jianhui Chang, Zhenghui Zhao, Lingbo Yang, Chuanmin Jia, Jian Zhang,
Siwei Ma
- Abstract summary: We propose an end-to-end semantic prior-based conceptual coding scheme towards extremely low bitrate image compression.
We employ semantic segmentation maps as structural guidance for extracting deep semantic prior.
A cross-channel entropy model is proposed to further exploit the inter-channel correlation of the spatially independent semantic prior.
- Score: 26.41657489930382
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Conceptual coding has been an emerging research topic recently, which encodes
natural images into disentangled conceptual representations for compression.
However, the compression performance of the existing methods is still
sub-optimal due to the lack of comprehensive consideration of rate constraint
and reconstruction quality. To this end, we propose a novel end-to-end semantic
prior modeling-based conceptual coding scheme towards extremely low bitrate
image compression, which leverages semantic-wise deep representations as a
unified prior for entropy estimation and texture synthesis. Specifically, we
employ semantic segmentation maps as structural guidance for extracting deep
semantic prior, which provides fine-grained texture distribution modeling for
better detail construction and higher flexibility in subsequent high-level
vision tasks. Moreover, a cross-channel entropy model is proposed to further
exploit the inter-channel correlation of the spatially independent semantic
prior, leading to more accurate entropy estimation for rate-constrained
training. The proposed scheme achieves an ultra-high 1000x compression ratio,
while still enjoying high visual reconstruction quality and versatility towards
visual processing and analysis tasks.
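To put the 1000x compression ratio in concrete terms, the following minimal sketch converts a compression ratio into bits per pixel and a file size. It assumes the ratio is measured against uncompressed 24-bit RGB, which the abstract does not state. The function names and the 512x512 example resolution are illustrative and not taken from the paper.

```python
def bits_per_pixel(compression_ratio: float, raw_bpp: float = 24.0) -> float:
    """Bitrate in bits per pixel after compressing by the given ratio.

    raw_bpp=24.0 assumes an uncompressed 8-bit RGB baseline (an assumption,
    not something the abstract specifies).
    """
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    return raw_bpp / compression_ratio


def compressed_size_bytes(width: int, height: int,
                          compression_ratio: float,
                          raw_bpp: float = 24.0) -> float:
    """Approximate compressed size in bytes for a width x height image."""
    total_bits = width * height * bits_per_pixel(compression_ratio, raw_bpp)
    return total_bits / 8


if __name__ == "__main__":
    # Hypothetical 512x512 RGB image compressed at a 1000x ratio.
    print(f"{bits_per_pixel(1000):.3f} bpp")
    print(f"{compressed_size_bytes(512, 512, 1000):.1f} bytes")
```

Under these assumptions, a 1000x ratio corresponds to about 0.024 bits per pixel, so a 512x512 image that occupies 786,432 bytes uncompressed would take roughly 786 bytes.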
Related papers
- Graph-Boosted Attentive Network for Semantic Body Parsing [1.4042211166197214]
This paper proposes a novel approach to decomposing multiple human bodies into semantic part regions in unconstrained environments.
We propose a convolutional neural network architecture that comprises novel semantic and contour attention mechanisms across the feature hierarchy.
Our proposed method achieves state-of-the-art results on the challenging Pascal Person-Part dataset.
arXiv Detail & Related papers (2024-07-08T13:32:01Z)
- Coarse-to-Fine Latent Diffusion for Pose-Guided Person Image Synthesis [65.7968515029306]
We propose a novel Coarse-to-Fine Latent Diffusion (CFLD) method for Pose-Guided Person Image Synthesis (PGPIS).
A perception-refined decoder is designed to progressively refine a set of learnable queries and extract semantic understanding of person images as a coarse-grained prompt.
arXiv Detail & Related papers (2024-02-28T06:07:07Z) - JoReS-Diff: Joint Retinex and Semantic Priors in Diffusion Model for Low-light Image Enhancement [69.6035373784027]
Low-light image enhancement (LLIE) has achieved promising performance by employing conditional diffusion models.
Previous methods may neglect the importance of a sufficient formulation of task-specific condition strategy.
We propose JoReS-Diff, a novel approach that incorporates Retinex- and semantic-based priors as the additional pre-processing condition.
arXiv Detail & Related papers (2023-12-20T08:05:57Z) - Corner-to-Center Long-range Context Model for Efficient Learned Image
Compression [70.0411436929495]
In the framework of learned image compression, the context model plays a pivotal role in capturing the dependencies among latent representations.
We propose the Corner-to-Center transformer-based Context Model (C$^3$M) designed to enhance context and latent predictions.
In addition, to enlarge the receptive field in the analysis and synthesis transformation, we use the Long-range Crossing Attention Module (LCAM) in the encoder/decoder.
arXiv Detail & Related papers (2023-11-29T21:40:28Z) - Joint Hierarchical Priors and Adaptive Spatial Resolution for Efficient
Neural Image Compression [11.25130799452367]
We propose an absolute image compression transformer (ICT) for neural image compression (NIC).
ICT captures both global and local contexts from the latent representations and better parameterizes the distribution of the quantized latents.
Our framework significantly improves the trade-off between coding efficiency and decoder complexity over the versatile video coding (VVC) reference encoder (VTM-18.0) and the neural SwinT-ChARM.
arXiv Detail & Related papers (2023-07-05T13:17:14Z) - Semantic Image Synthesis via Diffusion Models [159.4285444680301]
Denoising Diffusion Probabilistic Models (DDPMs) have achieved remarkable success in various image generation tasks.
Recent work on semantic image synthesis mainly follows the de facto Generative Adversarial Nets (GANs).
arXiv Detail & Related papers (2022-06-30T18:31:51Z) - Post-Training Quantization for Cross-Platform Learned Image Compression [15.67527732099067]
It has been witnessed that learned image compression has outperformed conventional image coding techniques.
One of the most critical issues that need to be considered is the non-deterministic calculation.
We propose to solve this problem by introducing well-developed post-training quantization.
arXiv Detail & Related papers (2022-02-15T15:41:12Z) - Implicit Neural Representations for Image Compression [103.78615661013623]
Implicit Neural Representations (INRs) have gained attention as a novel and effective representation for various data types.
We propose the first comprehensive compression pipeline based on INRs including quantization, quantization-aware retraining and entropy coding.
We find that our approach to source compression with INRs vastly outperforms similar prior work.
arXiv Detail & Related papers (2021-12-08T13:02:53Z) - Causal Contextual Prediction for Learned Image Compression [36.08393281509613]
We propose the concept of separate entropy coding to leverage a serial decoding process for causal contextual entropy prediction in the latent space.
A causal context model is proposed that separates the latents across channels and makes use of cross-channel relationships to generate highly informative contexts.
We also propose a causal global prediction model, which is able to find global reference points for accurate predictions of unknown points.
arXiv Detail & Related papers (2020-11-19T08:15:10Z) - Towards Analysis-friendly Face Representation with Scalable Feature and
Texture Compression [113.30411004622508]
We show that a universal and collaborative visual information representation can be achieved in a hierarchical way.
Based on the strong generative capability of deep neural networks, the gap between the base feature layer and enhancement layer is further filled with the feature level texture reconstruction.
To improve the efficiency of the proposed framework, the base layer neural network is trained in a multi-task manner.
arXiv Detail & Related papers (2020-04-21T14:32:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.