Learned Image Compression with Generalized Octave Convolution and
Cross-Resolution Parameter Estimation
- URL: http://arxiv.org/abs/2209.03353v1
- Date: Wed, 7 Sep 2022 08:21:52 GMT
- Title: Learned Image Compression with Generalized Octave Convolution and
Cross-Resolution Parameter Estimation
- Authors: Haisheng Fu, Feng Liang
- Abstract summary: We propose a learned multi-resolution image compression framework, which exploits octave convolutions to factorize the latent representations into the high-resolution (HR) and low-resolution (LR) parts.
Experimental results show that our method reduces the decoding time by approximately 73.35% and 93.44%, respectively, compared with that of state-of-the-art learned image compression methods.
- Score: 5.238765582868391
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The application of the context-adaptive entropy model significantly improves
the rate-distortion (R-D) performance, in which hyperpriors and autoregressive
models are jointly utilized to effectively capture the spatial redundancy of
the latent representations. However, the latent representations still contain
some spatial correlations. In addition, these methods based on the
context-adaptive entropy model cannot be accelerated in the decoding process by
parallel computing devices, e.g., FPGAs or GPUs. To alleviate these limitations,
we propose a learned multi-resolution image compression framework, which
exploits the recently developed octave convolutions to factorize the latent
representations into the high-resolution (HR) and low-resolution (LR) parts,
similar to a wavelet transform, which further improves the R-D performance. To
speed up decoding, our scheme does not use a context-adaptive entropy model.
Instead, we exploit an additional hyper layer, including a hyper encoder and a
hyper decoder, to further remove the spatial redundancy of the latent
representation.
Moreover, the cross-resolution parameter estimation (CRPE) is introduced into
the proposed framework to enhance the flow of information and further improve
the rate-distortion performance. An additional information-fidelity loss is
proposed to the total loss function to adjust the contribution of the LR part
to the final bit stream. Experimental results show that our method separately
reduces the decoding time by approximately 73.35 % and 93.44 % compared with
that of state-of-the-art learned image compression methods, and the R-D
performance is still better than H.266/VVC(4:2:0) and some learning-based
methods on both PSNR and MS-SSIM metrics across a wide bit rates.
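To make the training objective described above concrete, the following is a minimal sketch, assuming a PyTorch implementation, of a rate-distortion loss with an additional information-fidelity term that adjusts the LR branch's contribution. The paper's exact formulation is not given here, so the weights lambda_rd and beta_if, the MSE-based fidelity term, and the likelihood-based rate estimate are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: R-D objective plus an information-fidelity term on
# the low-resolution (LR) branch. Term forms and weights are assumptions.
import torch
import torch.nn.functional as F


def rate_in_bits(likelihoods: torch.Tensor) -> torch.Tensor:
    """Standard entropy-model rate estimate: sum of -log2(likelihood)."""
    return (-torch.log2(likelihoods)).sum()


def total_loss(x, x_hat, hr_likelihoods, lr_likelihoods, lr_recon, lr_target,
               lambda_rd: float, beta_if: float) -> torch.Tensor:
    # lambda_rd / beta_if are trade-off weights chosen per target bit rate.
    n, _, h, w = x.shape
    num_pixels = n * h * w
    # Rate: bits per pixel contributed by the HR and LR latent streams.
    bpp = (rate_in_bits(hr_likelihoods) + rate_in_bits(lr_likelihoods)) / num_pixels
    # Distortion on the final reconstruction (MSE for a PSNR-oriented model).
    distortion = F.mse_loss(x_hat, x)
    # Assumed information-fidelity term: keeps the LR branch close to a coarse
    # reference image, adjusting its share of the final bit stream.
    info_fidelity = F.mse_loss(lr_recon, lr_target)
    return bpp + lambda_rd * distortion + beta_if * info_fidelity
```

Because the scheme drops the autoregressive context model, the rate terms above depend only on the hyper decoder's output, so both latent streams can in principle be entropy-decoded in parallel.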
Related papers
- Test-time adaptation for image compression with distribution regularization [43.490138269939344]
We introduce a simple Bayesian approximation-endowed distribution regularization to encourage learning a better joint probability approximation in a plug-and-play manner.
Our proposed method not only improves the R-D performance compared with other latent refinement counterparts, but also can be flexibly integrated into existing TTA-IC methods with incremental benefits.
arXiv Detail & Related papers (2024-10-16T03:25:16Z) - Binarized Diffusion Model for Image Super-Resolution [61.963833405167875]
Binarization, an ultra-compression algorithm, offers the potential for effectively accelerating advanced diffusion models (DMs).
Existing binarization methods result in significant performance degradation.
We introduce a novel binarized diffusion model, BI-DiffSR, for image SR.
arXiv Detail & Related papers (2024-06-09T10:30:25Z) - Efficient Real-world Image Super-Resolution Via Adaptive Directional Gradient Convolution [80.85121353651554]
We introduce kernel-wise differential operations within the convolutional kernel and develop several learnable directional gradient convolutions.
These convolutions are integrated in parallel with a novel linear weighting mechanism to form an Adaptive Directional Gradient Convolution (DGConv).
We further devise an Adaptive Information Interaction Block (AIIBlock) to adeptly balance the enhancement of texture and contrast while meticulously investigating the interdependencies, culminating in the creation of a DGPNet for Real-SR through simple stacking.
arXiv Detail & Related papers (2024-05-11T14:21:40Z) - Adaptive Semantic-Enhanced Denoising Diffusion Probabilistic Model for Remote Sensing Image Super-Resolution [7.252121550658619]
Denoising Diffusion Probabilistic Model (DDPM) has shown promising performance in image reconstructions.
High-frequency details generated by DDPM often suffer from misalignment with HR images due to the model's tendency to overlook long-range semantic contexts.
An adaptive semantic-enhanced DDPM (ASDDPM) is proposed to enhance the detail-preserving capability of the DDPM.
arXiv Detail & Related papers (2024-03-17T04:08:58Z) - DGNet: Dynamic Gradient-Guided Network for Water-Related Optics Image
Enhancement [77.0360085530701]
Underwater image enhancement (UIE) is a challenging task due to the complex degradation caused by underwater environments.
Previous methods often idealize the degradation process, and neglect the impact of medium noise and object motion on the distribution of image features.
Our approach utilizes predicted images to dynamically update pseudo-labels, adding a dynamic gradient to optimize the network's gradient space.
arXiv Detail & Related papers (2023-12-12T06:07:21Z) - Dynamic Kernel-Based Adaptive Spatial Aggregation for Learned Image
Compression [63.56922682378755]
We focus on extending spatial aggregation capability and propose a dynamic kernel-based transform coding.
The proposed adaptive aggregation generates kernel offsets to capture valid information in the content-conditioned range to help the transform.
Experimental results demonstrate that our method achieves superior rate-distortion performance on three benchmarks compared to the state-of-the-art learning-based methods.
arXiv Detail & Related papers (2023-08-17T01:34:51Z) - LLIC: Large Receptive Field Transform Coding with Adaptive Weights for Learned Image Compression [27.02281402358164]
We propose Large Receptive Field Transform Coding with Adaptive Weights for Learned Image Compression.
We introduce a few large kernel-based depth-wise convolutions to reduce more redundancy while maintaining modest complexity.
Our LLIC models achieve state-of-the-art performance and better trade-offs between performance and complexity.
arXiv Detail & Related papers (2023-04-19T11:19:10Z) - DCS-RISR: Dynamic Channel Splitting for Efficient Real-world Image
Super-Resolution [15.694407977871341]
Real-world image super-resolution (RISR) has received increased focus for improving the quality of SR images under unknown complex degradation.
Existing methods rely on the heavy SR models to enhance low-resolution (LR) images of different degradation levels.
We propose a novel Dynamic Channel Splitting scheme for efficient Real-world Image Super-Resolution, termed DCS-RISR.
arXiv Detail & Related papers (2022-12-15T04:34:57Z) - Learning True Rate-Distortion-Optimization for End-To-End Image
Compression [59.816251613869376]
Rate-distortion optimization is a crucial part of traditional image and video compression.
In this paper, we enhance the training by introducing low-complexity estimations of the RDO result into the training.
We achieve average rate savings of 19.6% in MS-SSIM over the previous RDONet model, which equals rate savings of 27.3% over a comparable conventional deep image coder.
arXiv Detail & Related papers (2022-01-05T13:02:00Z) - Uncovering the Over-smoothing Challenge in Image Super-Resolution: Entropy-based Quantification and Contrastive Optimization [67.99082021804145]
We propose an explicit solution to the COO problem, called Detail Enhanced Contrastive Loss (DECLoss).
DECLoss utilizes the clustering property of contrastive learning to directly reduce the variance of the potential high-resolution distribution.
We evaluate DECLoss on multiple super-resolution benchmarks and demonstrate that it improves the perceptual quality of PSNR-oriented models.
arXiv Detail & Related papers (2022-01-04T08:30:09Z) - Generalized Octave Convolutions for Learned Multi-Frequency Image
Compression [20.504561050200365]
We propose the first learned multi-frequency image compression and entropy coding approach.
It is based on the recently developed octave convolutions to factorize the latents into high and low frequency (resolution) components.
We show that the proposed generalized octave convolution can improve the performance of other auto-encoder-based computer vision tasks.
arXiv Detail & Related papers (2020-02-24T01:35:29Z)
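As a companion illustration of the octave-style factorization that both the paper above and this related work rely on, here is a minimal sketch, assuming PyTorch, of a convolution block that keeps separate HR and LR feature paths. The class name, the channel split ratio alpha, and the use of a strided convolution / transposed convolution for the inter-frequency paths are assumptions made in the spirit of the generalized octave convolution, not the authors' exact layer.

```python
# Hedged sketch of an octave-style convolution block with HR and LR paths.
# The learned resampling on the cross-frequency paths is an assumption about
# how the "generalized" variant differs from the original octave convolution.
import torch
import torch.nn as nn


class GeneralizedOctaveConv(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, alpha: float = 0.5, k: int = 3):
        super().__init__()
        lr_in, lr_out = int(alpha * in_ch), int(alpha * out_ch)
        hr_in, hr_out = in_ch - lr_in, out_ch - lr_out
        p = k // 2
        # Intra-frequency paths keep each resolution unchanged.
        self.hr_to_hr = nn.Conv2d(hr_in, hr_out, k, padding=p)
        self.lr_to_lr = nn.Conv2d(lr_in, lr_out, k, padding=p)
        # Inter-frequency paths with learned resampling (assumed: stride-2
        # convolution for HR->LR, stride-2 transposed convolution for LR->HR).
        self.hr_to_lr = nn.Conv2d(hr_in, lr_out, k, stride=2, padding=p)
        self.lr_to_hr = nn.ConvTranspose2d(lr_in, hr_out, 4, stride=2, padding=1)

    def forward(self, x_hr: torch.Tensor, x_lr: torch.Tensor):
        y_hr = self.hr_to_hr(x_hr) + self.lr_to_hr(x_lr)
        y_lr = self.lr_to_lr(x_lr) + self.hr_to_lr(x_hr)
        return y_hr, y_lr


# Usage: the LR path runs at half the spatial resolution of the HR path.
x_hr = torch.randn(1, 96, 64, 64)
x_lr = torch.randn(1, 96, 32, 32)
y_hr, y_lr = GeneralizedOctaveConv(192, 192)(x_hr, x_lr)  # (1,96,64,64), (1,96,32,32)
```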