Raising The Limit Of Image Rescaling Using Auxiliary Encoding
- URL: http://arxiv.org/abs/2303.06747v1
- Date: Sun, 12 Mar 2023 20:49:07 GMT
- Title: Raising The Limit Of Image Rescaling Using Auxiliary Encoding
- Authors: Chenzhong Yin, Zhihong Pan, Xin Zhou, Le Kang and Paul Bogdan
- Abstract summary: Recently, image rescaling models like IRN utilize the bidirectional nature of INN to push the performance limit of image upscaling.
We propose auxiliary encoding modules to further push the limit of image rescaling performance.
- Score: 7.9700865143145485
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Normalizing flow models using invertible neural networks (INN) have been
widely investigated for successful generative image super-resolution (SR) by
learning the transformation between the normal distribution of latent variable
$z$ and the conditional distribution of high-resolution (HR) images given a
low-resolution (LR) input. Recently, image rescaling models like IRN utilize
the bidirectional nature of INN to push the performance limit of image
upscaling by optimizing the downscaling and upscaling steps jointly. While the
random sampling of latent variable $z$ is useful in generating diverse
photo-realistic images, it is not desirable for image rescaling when accurate
restoration of the HR image is more important. Hence, in place of random
sampling of $z$, we propose auxiliary encoding modules to further push the
limit of image rescaling performance. Two options to store the encoded latent
variables in downscaled LR images, both readily supported by existing image
file formats, are proposed. One saves them as the alpha channel and the other
as metadata in the image header; the corresponding modules are denoted with
suffixes -A and -M, respectively. Optimal network architectural
changes are investigated for both options to demonstrate their effectiveness in
raising the rescaling performance limit on different baseline models including
IRN and DLV-IRN.
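As a rough illustration of the two storage options, the sketch below shows how quantized latent codes from an auxiliary encoder could be packed into a downscaled LR image, either as an extra alpha channel (the -A option) or as a text chunk in the PNG header (the -M option). This is not the authors' implementation; the 8-bit quantization, the function names, and the per-pixel latent shape are assumptions made for the example.

```python
import json
import numpy as np
from PIL import Image
from PIL.PngImagePlugin import PngInfo

def quantize_latent(z, bits=8):
    """Map a real-valued latent tensor to uint8 codes plus the range needed to invert them."""
    z_min, z_max = float(z.min()), float(z.max())
    scale = (2 ** bits - 1) / max(z_max - z_min, 1e-8)
    codes = np.round((z - z_min) * scale).astype(np.uint8)
    return codes, (z_min, z_max)

def save_lr_with_alpha(lr_rgb, z, path):
    """-A option (assumed layout): store one latent code per LR pixel in the alpha channel."""
    codes, _ = quantize_latent(z)
    assert codes.shape == lr_rgb.shape[:2], "per-pixel latent expected for the alpha option"
    rgba = np.dstack([lr_rgb, codes])  # H x W x 4, still a standard RGBA image
    Image.fromarray(rgba).save(path)

def save_lr_with_metadata(lr_rgb, z, path):
    """-M option (assumed layout): store the flattened latent codes as a text chunk in the PNG header."""
    codes, (z_min, z_max) = quantize_latent(z)
    meta = PngInfo()
    meta.add_text("latent_codes", json.dumps(codes.flatten().tolist()))
    meta.add_text("latent_meta", json.dumps({"min": z_min, "max": z_max, "shape": list(codes.shape)}))
    Image.fromarray(lr_rgb).save(path, pnginfo=meta)

# Stand-ins for a downscaled LR image and a per-pixel latent map from the encoder.
lr = (np.random.rand(64, 64, 3) * 255).astype(np.uint8)
z = np.random.randn(64, 64).astype(np.float32)
save_lr_with_alpha(lr, z, "lr_a.png")
save_lr_with_metadata(lr, z, "lr_m.png")
```

At upscaling time, the inverse pass of the INN would read the alpha channel or the header chunk back, dequantize it to recover the latent, and use it in place of randomly sampled $z$.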
Related papers
- Realistic Extreme Image Rescaling via Generative Latent Space Learning [51.85790402171696]
We propose a novel framework called Latent Space Based Image Rescaling (LSBIR) for extreme image rescaling tasks.
LSBIR effectively leverages powerful natural image priors learned by a pre-trained text-to-image diffusion model to generate realistic HR images.
In the first stage, a pseudo-invertible encoder-decoder models the bidirectional mapping between the latent features of the HR image and the target-sized LR image.
In the second stage, the reconstructed features from the first stage are refined by a pre-trained diffusion model to generate more faithful and visually pleasing details.
arXiv Detail & Related papers (2024-08-17T09:51:42Z)
- Image-GS: Content-Adaptive Image Representation via 2D Gaussians [55.15950594752051]
We propose Image-GS, a content-adaptive image representation.
Using anisotropic 2D Gaussians as the basis, Image-GS shows high memory efficiency, supports fast random access, and offers a natural level-of-detail stack.
The general efficiency and fidelity of Image-GS are validated against several recent neural image representations and industry-standard texture compressors.
We hope this research offers insights for developing new applications that require adaptive quality and resource control, such as machine perception, asset streaming, and content generation.
arXiv Detail & Related papers (2024-07-02T00:45:21Z)
- Arbitrary-Scale Image Generation and Upsampling using Latent Diffusion Model and Implicit Neural Decoder [29.924160271522354]
Super-resolution (SR) and image generation are important tasks in computer vision and are widely adopted in real-world applications.
Most existing methods, however, generate images only at fixed-scale magnification and suffer from over-smoothing and artifacts.
The most relevant prior work applied Implicit Neural Representation (INR) to the denoising diffusion model to obtain continuous-resolution yet diverse and high-quality SR results.
We propose a novel pipeline that can super-resolve an input image or generate a novel image from random noise at arbitrary scales.
arXiv Detail & Related papers (2024-03-15T12:45:40Z)
- Beyond Learned Metadata-based Raw Image Reconstruction [86.1667769209103]
Raw images have distinct advantages over sRGB images, e.g., linearity and fine-grained quantization levels.
However, they are not widely adopted by general users due to their substantial storage requirements.
We propose a novel framework that learns a compact representation in the latent space, serving as metadata.
arXiv Detail & Related papers (2023-06-21T06:59:07Z)
- Self-Asymmetric Invertible Network for Compression-Aware Image Rescaling [6.861753163565238]
In real-world applications, most images are compressed for transmission.
We propose the Self-Asymmetric Invertible Network (SAIN) for compression-aware image rescaling.
arXiv Detail & Related papers (2023-03-04T08:33:46Z)
- Effective Invertible Arbitrary Image Rescaling [77.46732646918936]
Invertible Neural Networks (INN) are able to increase upscaling accuracy significantly by optimizing the downscaling and upscaling cycle jointly.
In this work, a simple and effective invertible arbitrary rescaling network (IARN) is proposed to achieve arbitrary image rescaling by training only one model.
It is shown to achieve a state-of-the-art (SOTA) performance in bidirectional arbitrary rescaling without compromising perceptual quality in LR outputs.
arXiv Detail & Related papers (2022-09-26T22:22:30Z)
- Enhancing Image Rescaling using Dual Latent Variables in Invertible Neural Network [42.18106162158025]
A new downscaling latent variable is introduced to model variations in the image downscaling process.
It can improve image upscaling accuracy consistently without sacrificing image quality in downscaled LR images.
It is also shown to be effective in enhancing other INN-based models for image restoration applications like image hiding.
arXiv Detail & Related papers (2022-07-24T23:12:51Z)
- LAPAR: Linearly-Assembled Pixel-Adaptive Regression Network for Single Image Super-Resolution and Beyond [75.37541439447314]
Single image super-resolution (SISR) deals with a fundamental problem of upsampling a low-resolution (LR) image to its high-resolution (HR) version.
This paper proposes a linearly-assembled pixel-adaptive regression network (LAPAR) to strike a sweet spot of deep model complexity and resulting SISR quality.
arXiv Detail & Related papers (2021-05-21T15:47:18Z)
- Super-Resolution of Real-World Faces [3.4376560669160394]
Real low-resolution (LR) face images contain degradations which are too varied and complex to be captured by known downsampling kernels.
In this paper, we propose a two module super-resolution network where the feature extractor module extracts robust features from the LR image.
We train a degradation GAN to convert bicubically downsampled clean images to real degraded images, and interpolate between the obtained degraded LR image and its clean LR counterpart.
arXiv Detail & Related papers (2020-11-04T17:25:54Z)
- Locally Masked Convolution for Autoregressive Models [107.4635841204146]
LMConv is a simple modification to the standard 2D convolution that allows arbitrary masks to be applied to the weights at each location in the image (a minimal illustrative sketch is given after this entry).
We learn an ensemble of distribution estimators that share parameters but differ in generation order, achieving improved performance on whole-image density estimation.
arXiv Detail & Related papers (2020-06-22T17:59:07Z)
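To make the locally masked convolution idea from the last entry concrete, here is a minimal NumPy sketch: a shared kernel is multiplied by a per-location binary mask before each output value is computed, so different masks can realize different autoregressive generation orders. It is an illustrative reimplementation under assumed shapes and names, not the authors' code.

```python
import numpy as np

def locally_masked_conv2d(x, weight, masks):
    """2D convolution whose kernel weights are masked differently at every output location.

    x      : (H, W) single-channel input
    weight : (k, k) convolution kernel shared across locations
    masks  : (H, W, k, k) binary masks selecting which weights are active at each location
    """
    k = weight.shape[0]
    pad = k // 2
    xp = np.pad(x, pad)
    H, W = x.shape
    out = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            patch = xp[i:i + k, j:j + k]          # receptive field centred at (i, j)
            out[i, j] = np.sum(patch * weight * masks[i, j])
    return out

# Example: a raster-scan mask (only pixels above, or to the left on the same row, are visible).
# In LMConv the mask may vary per location, which is what allows arbitrary generation orders.
H = W = 8
k = 3
x = np.random.rand(H, W)
weight = np.random.randn(k, k)
masks = np.zeros((H, W, k, k))
masks[:, :, :k // 2, :] = 1.0          # kernel rows above the centre
masks[:, :, k // 2, :k // 2] = 1.0     # left half of the centre row
out = locally_masked_conv2d(x, weight, masks)
print(out.shape)  # (8, 8)
```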
This list is automatically generated from the titles and abstracts of the papers on this site.