Adversarial Generation of Continuous Images
- URL: http://arxiv.org/abs/2011.12026v2
- Date: Mon, 28 Jun 2021 09:00:05 GMT
- Title: Adversarial Generation of Continuous Images
- Authors: Ivan Skorokhodov, Savva Ignatyev, Mohamed Elhoseiny
- Abstract summary: In this paper, we propose two novel architectural techniques for building INR-based image decoders.
We use them to build a state-of-the-art continuous image GAN.
Our proposed INR-GAN architecture improves the performance of continuous image generators by several times.
- Score: 31.92891885615843
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In most existing learning systems, images are typically viewed as 2D pixel
arrays. However, in another paradigm gaining popularity, a 2D image is
represented as an implicit neural representation (INR) - an MLP that predicts
an RGB pixel value given its (x,y) coordinate. In this paper, we propose two
novel architectural techniques for building INR-based image decoders:
factorized multiplicative modulation and multi-scale INRs, and use them to
build a state-of-the-art continuous image GAN. Previous attempts to adapt INRs
for image generation were limited to MNIST-like datasets and do not scale to
complex real-world data. Our proposed INR-GAN architecture improves the
performance of continuous image generators by several times, greatly reducing
the gap between continuous image GANs and pixel-based ones. Apart from that, we
explore several exciting properties of the INR-based decoders, like
out-of-the-box superresolution, meaningful image-space interpolation,
accelerated inference of low-resolution images, an ability to extrapolate
outside of image boundaries, and strong geometric prior. The project page is
located at https://universome.github.io/inr-gan.
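The INR idea in the abstract can be sketched in a few lines. The following is only an illustration under stated assumptions, not the INR-GAN architecture: the two-layer MLP, its random weights, the [-1, 1] coordinate range, and the helper names `make_coords`, `init_inr`, and `render` are all hypothetical. It shows that one coordinate-to-RGB function can be sampled on a coarse grid, on a denser grid (the basis of out-of-the-box superresolution), or at coordinates outside the original image boundaries.

```python
import numpy as np

def make_coords(height, width, lo=-1.0, hi=1.0):
    """Return a (height*width, 2) array of (x, y) coordinates in [lo, hi]."""
    ys = np.linspace(lo, hi, height)
    xs = np.linspace(lo, hi, width)
    grid_x, grid_y = np.meshgrid(xs, ys)  # both have shape (height, width)
    return np.stack([grid_x.ravel(), grid_y.ravel()], axis=1)

def init_inr(rng, hidden=32):
    """Random weights for a toy two-layer MLP (assumption: untrained)."""
    return {
        "w1": rng.normal(0.0, 1.0, size=(2, hidden)),
        "b1": np.zeros(hidden),
        "w2": rng.normal(0.0, 1.0 / np.sqrt(hidden), size=(hidden, 3)),
        "b2": np.zeros(3),
    }

def render(params, coords, height, width):
    """Evaluate the MLP at every coordinate and reshape into an RGB image."""
    hidden = np.tanh(coords @ params["w1"] + params["b1"])
    logits = hidden @ params["w2"] + params["b2"]
    rgb = 1.0 / (1.0 + np.exp(-logits))  # squash each channel into (0, 1)
    return rgb.reshape(height, width, 3)

rng = np.random.default_rng(0)
params = init_inr(rng)

# The same function sampled at two resolutions.
low = render(params, make_coords(8, 8), 8, 8)
high = render(params, make_coords(15, 15), 15, 15)

# Sampling beyond [-1, 1] queries the function outside the image boundaries.
wide = render(params, make_coords(8, 8, lo=-1.5, hi=1.5), 8, 8)
```

Because the 15x15 grid contains every point of the 8x8 grid, `high[::2, ::2]` matches `low` up to floating-point error; the extra pixels are new samples of the same underlying function rather than an interpolation of stored pixels.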
Related papers
- Image-GS: Content-Adaptive Image Representation via 2D Gaussians [55.15950594752051]
We propose Image-GS, a content-adaptive image representation.
Using anisotropic 2D Gaussians as the basis, Image-GS shows high memory efficiency, supports fast random access, and offers a natural level of detail stack.
General efficiency and fidelity of Image-GS are validated against several recent neural image representations and industry-standard texture compressors.
We hope this research offers insights for developing new applications that require adaptive quality and resource control, such as machine perception, asset streaming, and content generation.
arXiv Detail & Related papers (2024-07-02T00:45:21Z)
- Beyond Learned Metadata-based Raw Image Reconstruction [86.1667769209103]
Raw images have distinct advantages over sRGB images, e.g., linearity and fine-grained quantization levels.
They are not widely adopted by general users due to their substantial storage requirements.
We propose a novel framework that learns a compact representation in the latent space, serving as metadata.
arXiv Detail & Related papers (2023-06-21T06:59:07Z)
- T-former: An Efficient Transformer for Image Inpainting [50.43302925662507]
A class of attention-based network architectures, called transformers, has shown strong performance in natural language processing.
In this paper, we design a novel attention linearly related to the resolution according to Taylor expansion, and based on this attention, a network called $T$-former is designed for image inpainting.
Experiments on several benchmark datasets demonstrate that our proposed method achieves state-of-the-art accuracy while maintaining a relatively low number of parameters and computational complexity.
arXiv Detail & Related papers (2023-05-12T04:10:42Z)
- Polynomial Implicit Neural Representations For Large Diverse Datasets [0.0]
Implicit neural representations (INR) have gained significant popularity for signal and image representation.
Most INR architectures rely on sinusoidal positional encoding, which accounts for high-frequency information in data.
Our approach addresses this gap by representing an image with a function and eliminates the need for positional encodings.
The proposed Poly-INR model performs comparably to state-of-the-art generative models without any convolution, normalization, or self-attention.
arXiv Detail & Related papers (2023-03-20T20:09:46Z)
- Dense Pixel-to-Pixel Harmonization via Continuous Image Representation [22.984119094424056]
We propose a novel image Harmonization method based on Implicit neural Networks (HINet).
Inspired by the Retinex theory, we decouple the harmonizations into two parts to respectively capture the content and environment of composite images.
Extensive experiments have demonstrated the effectiveness of our method compared with state-of-the-art methods.
arXiv Detail & Related papers (2023-03-03T02:52:28Z)
- Memory Efficient Patch-based Training for INR-based GANs [13.19626131847784]
Training existing approaches requires a heavy computational cost proportional to the image resolution.
We propose a multi-stage patch-based training, a novel and scalable approach that can train INR-based GANs with a flexible computational cost.
Specifically, our method generates and discriminates by patch, learning both the local details of the image and its global structural information.
arXiv Detail & Related papers (2022-07-04T13:28:53Z)
- Image Compression with Recurrent Neural Network and Generalized Divisive Normalization [3.0204520109309843]
Deep learning has gained huge attention from the research community and produced promising image reconstruction results.
Recent methods focused on developing deeper and more complex networks, which significantly increased network complexity.
In this paper, two effective novel blocks are developed: an analysis block and a synthesis block, which employ the convolution layer and Generalized Divisive Normalization (GDN) on the variable-rate encoder and decoder sides.
arXiv Detail & Related papers (2021-09-05T05:31:55Z)
- UltraSR: Spatial Encoding is a Missing Key for Implicit Image Function-based Arbitrary-Scale Super-Resolution [74.82282301089994]
In this work, we propose UltraSR, a simple yet effective new network design based on implicit image functions.
We show that spatial encoding is indeed a missing key towards the next-stage high-accuracy implicit image function.
Our UltraSR sets new state-of-the-art performance on the DIV2K benchmark under all super-resolution scales.
arXiv Detail & Related papers (2021-03-23T17:36:42Z)
- Locally Masked Convolution for Autoregressive Models [107.4635841204146]
LMConv is a simple modification to the standard 2D convolution that allows arbitrary masks to be applied to the weights at each location in the image.
We learn an ensemble of distribution estimators that share parameters but differ in generation order, achieving improved performance on whole-image density estimation.
arXiv Detail & Related papers (2020-06-22T17:59:07Z)
- Image Fine-grained Inpainting [89.17316318927621]
We present a one-stage model that utilizes dense combinations of dilated convolutions to obtain larger and more effective receptive fields.
To better train this efficient generator, in addition to the frequently used VGG feature matching loss, we design a novel self-guided regression loss.
We also employ a discriminator with local and global branches to ensure local-global content consistency.
arXiv Detail & Related papers (2020-02-07T03:45:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.