Controlling Rate, Distortion, and Realism: Towards a Single Comprehensive Neural Image Compression Model
- URL: http://arxiv.org/abs/2405.16817v1
- Date: Mon, 27 May 2024 04:22:25 GMT
- Title: Controlling Rate, Distortion, and Realism: Towards a Single Comprehensive Neural Image Compression Model
- Authors: Shoma Iwai, Tomo Miyazaki, Shinichiro Omachi,
- Abstract summary: We propose a variable-rate generative NIC model to compress images to different bit rates.
By incorporating the newly proposed multi-realism technique, our method allows the users to adjust the bit rate, distortion, and realism with a single model.
Our method matches or surpasses the performance of state-of-the-art single-rate generative NIC models.
- Score: 4.096453902709292
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In recent years, neural network-driven image compression (NIC) has gained significant attention. Some works adopt deep generative models such as GANs and diffusion models to enhance perceptual quality (realism). A critical obstacle of these generative NIC methods is that each model is optimized for a single bit rate. Consequently, multiple models are required to compress images to different bit rates, which is impractical for real-world applications. To tackle this issue, we propose a variable-rate generative NIC model. Specifically, we explore several discriminator designs tailored for the variable-rate approach and introduce a novel adversarial loss. Moreover, by incorporating the newly proposed multi-realism technique, our method allows the users to adjust the bit rate, distortion, and realism with a single model, achieving ultra-controllability. Unlike existing variable-rate generative NIC models, our method matches or surpasses the performance of state-of-the-art single-rate generative NIC models while covering a wide range of bit rates using just one model. Code will be available at https://github.com/iwa-shi/CRDR
Related papers
- PAC-FNO: Parallel-Structured All-Component Fourier Neural Operators for Recognizing Low-Quality Images [38.773390121161924]
We propose a novel neural network model, parallel-structured and all-component Fourier neural operator (PAC-FNO)
Unlike conventional feed-forward neural networks, PAC-FNO operates in the frequency domain, allowing it to handle images of varying resolutions within a single model.
We show that the proposed PAC-FNO improves the performance of existing baseline models on images with various resolutions by up to 77.1% and various types of natural variations in the images at inference.
arXiv Detail & Related papers (2024-02-20T05:06:20Z) - Load-balanced Gather-scatter Patterns for Sparse Deep Neural Networks [20.374784902476318]
Pruning, as a method to introduce zeros to model weights, has shown to be an effective method to provide good trade-offs between model accuracy and computation efficiency.
Some modern processors are equipped with fast on-chip scratchpad memories and gather/scatter engines that perform indirect load and store operations on such memories.
In this work, we propose a set of novel sparse patterns, named gather-scatter (GS) patterns, to utilize the scratchpad memories and gather/scatter engines to speed up neural network inferences.
arXiv Detail & Related papers (2021-12-20T22:55:45Z) - Meta Internal Learning [88.68276505511922]
Internal learning for single-image generation is a framework, where a generator is trained to produce novel images based on a single image.
We propose a meta-learning approach that enables training over a collection of images, in order to model the internal statistics of the sample image more effectively.
Our results show that the models obtained are as suitable as single-image GANs for many common image applications.
arXiv Detail & Related papers (2021-10-06T16:27:38Z) - Dynamic Proximal Unrolling Network for Compressive Sensing Imaging [29.00266254916676]
We present a dynamic proximal unrolling network (dubbed DPUNet), which can handle a variety of measurement matrices via one single model without retraining.
Specifically, DPUNet can exploit both embedded physical model via gradient descent and imposing image prior with learned dynamic proximal mapping.
Experimental results demonstrate that the proposed DPUNet can effectively handle multiple CSI modalities under varying sampling ratios and noise levels with only one model.
arXiv Detail & Related papers (2021-07-23T03:04:44Z) - Rate Distortion Characteristic Modeling for Neural Image Compression [59.25700168404325]
End-to-end optimization capability offers neural image compression (NIC) superior lossy compression performance.
distinct models are required to be trained to reach different points in the rate-distortion (R-D) space.
We make efforts to formulate the essential mathematical functions to describe the R-D behavior of NIC using deep network and statistical modeling.
arXiv Detail & Related papers (2021-06-24T12:23:05Z) - Fast and Robust Cascade Model for Multiple Degradation Single Image
Super-Resolution [2.1574781022415364]
Single Image Super-Resolution (SISR) is one of the low-level computer vision problems that has received increased attention in the last few years.
Here, we propose a new formulation of the Convolutional Neural Network (CNN) cascade model.
A new densely connected CNN-architecture is proposed where the output of each sub- module is restricted using some external knowledge.
arXiv Detail & Related papers (2020-11-16T18:59:49Z) - Encoding Robustness to Image Style via Adversarial Feature Perturbations [72.81911076841408]
We adapt adversarial training by directly perturbing feature statistics, rather than image pixels, to produce robust models.
Our proposed method, Adversarial Batch Normalization (AdvBN), is a single network layer that generates worst-case feature perturbations during training.
arXiv Detail & Related papers (2020-09-18T17:52:34Z) - Locally Masked Convolution for Autoregressive Models [107.4635841204146]
LMConv is a simple modification to the standard 2D convolution that allows arbitrary masks to be applied to the weights at each location in the image.
We learn an ensemble of distribution estimators that share parameters but differ in generation order, achieving improved performance on whole-image density estimation.
arXiv Detail & Related papers (2020-06-22T17:59:07Z) - Dynamic Model Pruning with Feedback [64.019079257231]
We propose a novel model compression method that generates a sparse trained model without additional overhead.
We evaluate our method on CIFAR-10 and ImageNet, and show that the obtained sparse models can reach the state-of-the-art performance of dense models.
arXiv Detail & Related papers (2020-06-12T15:07:08Z) - Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.