A Wasserstein GAN for Joint Learning of Inpainting and its Spatial Optimisation
- URL: http://arxiv.org/abs/2202.05623v1
- Date: Fri, 11 Feb 2022 14:02:36 GMT
- Title: A Wasserstein GAN for Joint Learning of Inpainting and its Spatial Optimisation
- Authors: Pascal Peter
- Abstract summary: We propose the first generative adversarial network for spatial inpainting data optimisation.
In contrast to previous approaches, it allows joint training of an inpainting generator and a corresponding mask network.
This yields significant improvements in visual quality and speed over conventional models and also outperforms current optimisation networks.
- Score: 3.4392739159262145
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Classic image inpainting is a restoration method that reconstructs missing
image parts. However, a carefully selected mask of known pixels that yields a
high-quality inpainting can also act as a sparse image representation. This
challenging spatial optimisation problem is essential for practical
applications such as compression. So far, it has been almost exclusively
addressed by model-based approaches. First attempts with neural networks seem
promising, but are tailored towards specific inpainting operators or require
postprocessing. To address this issue, we propose the first generative
adversarial network for spatial inpainting data optimisation. In contrast to
previous approaches, it allows joint training of an inpainting generator and a
corresponding mask optimisation network. With a Wasserstein distance, we ensure
that our inpainting results accurately reflect the statistics of natural
images. This yields significant improvements in visual quality and speed over
conventional stochastic models and also outperforms current spatial
optimisation networks.
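The joint setup described in the abstract can be illustrated with a minimal sketch: a mask network scores pixel locations, a fixed fraction of the highest-scoring pixels is kept as known data, and a WGAN critic compares inpainted results against real images. The numpy helpers below (`topk_binarise`, `wgan_critic_loss`, `generator_loss` are hypothetical names of our own) show only the loss structure and density-constrained mask binarisation, not the paper's actual architecture or training code.

```python
import numpy as np

def topk_binarise(soft_mask, density):
    # Keep the `density` fraction of pixels with the highest scores as
    # the known pixels; everything else is left for the inpainting operator.
    flat = soft_mask.ravel()
    k = max(1, int(density * flat.size))
    thresh = np.partition(flat, -k)[-k]          # k-th largest score
    return (soft_mask >= thresh).astype(np.float64)

def wgan_critic_loss(critic_real, critic_fake):
    # The critic maximises E[c(real)] - E[c(fake)]; as a loss to minimise,
    # we negate it.  (WGANs additionally need a Lipschitz constraint,
    # e.g. a gradient penalty, omitted here.)
    return -(np.mean(critic_real) - np.mean(critic_fake))

def generator_loss(critic_fake):
    # The generator tries to make inpainted images score high under the critic.
    return -np.mean(critic_fake)
```

In practice the hard top-k binarisation is not differentiable, so joint training needs a surrogate such as a straight-through estimator; that detail is omitted in this sketch.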
Related papers
- PrefPaint: Aligning Image Inpainting Diffusion Model with Human Preference [62.72779589895124]
We make the first attempt to align diffusion models for image inpainting with human aesthetic standards via a reinforcement learning framework.
We train a reward model with a dataset we construct, consisting of nearly 51,000 images annotated with human preferences.
Experiments on inpainting comparison and downstream tasks, such as image extension and 3D reconstruction, demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-10-29T11:49:39Z)
- Realistic Extreme Image Rescaling via Generative Latent Space Learning [51.85790402171696]
We propose a novel framework called Latent Space Based Image Rescaling (LSBIR) for extreme image rescaling tasks.
LSBIR effectively leverages powerful natural image priors learned by a pre-trained text-to-image diffusion model to generate realistic HR images.
In the first stage, a pseudo-invertible encoder-decoder models the bidirectional mapping between the latent features of the HR image and the target-sized LR image.
In the second stage, the reconstructed features from the first stage are refined by a pre-trained diffusion model to generate more faithful and visually pleasing details.
arXiv Detail & Related papers (2024-08-17T09:51:42Z)
- WavePaint: Resource-efficient Token-mixer for Self-supervised Inpainting [2.3014300466616078]
This paper diverges from vision transformers by using a computationally-efficient WaveMix-based fully convolutional architecture -- WavePaint.
It uses a 2D-discrete wavelet transform (DWT) for spatial and multi-resolution token-mixing along with convolutional layers.
Our model even outperforms current GAN-based architectures on the CelebA-HQ dataset without using an adversarially trainable discriminator.
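The DWT-based token mixing mentioned above builds on a standard wavelet decomposition. As a rough illustration, a single-level 2D Haar transform can be written in plain numpy; this shows only the transform itself (the function name is ours), not WavePaint's mixing blocks:

```python
import numpy as np

def haar_dwt2(x):
    # Single-level 2D Haar DWT: returns the low-low (coarse approximation),
    # low-high, high-low, and high-high (detail) subbands, each at half
    # resolution.  Assumes even height and width.
    a = (x[0::2, :] + x[1::2, :]) / 2.0   # row pairs: average
    d = (x[0::2, :] - x[1::2, :]) / 2.0   # row pairs: difference
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return ll, lh, hl, hh
```

Because the subbands are half-resolution, subsequent convolutions mix information across a larger effective receptive field at lower cost, which is the appeal of wavelet token mixing.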
arXiv Detail & Related papers (2023-07-01T18:41:34Z)
- T-former: An Efficient Transformer for Image Inpainting [50.43302925662507]
A class of attention-based network architectures, called transformers, has shown strong performance in natural language processing.
In this paper, we design a novel attention mechanism whose cost is linearly related to the resolution, derived from a Taylor expansion; based on this attention, a network called $T$-former is designed for image inpainting.
Experiments on several benchmark datasets demonstrate that our proposed method achieves state-of-the-art accuracy while maintaining a relatively low number of parameters and computational complexity.
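The idea of attention that is linear in resolution can be illustrated generically: with a positive feature map in place of the softmax, the attention product can be regrouped so the quadratic token-by-token matrix is never formed. The sketch below uses the common `elu(x) + 1` feature map rather than T-former's Taylor-derived one, so it is an analogy, not the paper's exact operator:

```python
import numpy as np

def linear_attention(Q, K, V, eps=1e-6):
    """Attention with cost linear in the number of tokens n.

    Standard attention materialises an (n, n) matrix.  With a positive
    feature map phi, (phi(Q) phi(K)^T) V can be regrouped by associativity
    as phi(Q) (phi(K)^T V), which costs O(n d^2) instead of O(n^2 d).
    """
    phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))  # elu(x) + 1 > 0
    Qp, Kp = phi(Q), phi(K)
    kv = Kp.T @ V                    # (d, d_v): summary of keys and values
    z = Qp @ Kp.sum(axis=0)          # (n,): per-query normaliser
    return (Qp @ kv) / (z[:, None] + eps)
```

For inpainting, where feature maps at full image resolution contain many tokens, this regrouping is what keeps memory and compute tractable.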
arXiv Detail & Related papers (2023-05-12T04:10:42Z)
- High-Fidelity Image Inpainting with GAN Inversion [23.49170140410603]
In this paper, we propose a novel GAN inversion model for image inpainting, dubbed InvertFill.
Within the encoder, the pre-modulation network leverages multi-scale structures to encode more discriminative semantics into style vectors.
To reconstruct faithful and photorealistic images, a simple yet effective Soft-update Mean Latent module is designed to capture more diverse in-domain patterns that synthesize high-fidelity textures for large corruptions.
arXiv Detail & Related papers (2022-08-25T03:39:24Z)
- Interactive Image Inpainting Using Semantic Guidance [36.34615403590834]
This paper develops a novel image inpainting approach that enables users to customize the inpainting result by their own preference or memory.
In the first stage, an autoencoder based on a novel external spatial attention mechanism is deployed to produce reconstructed features of the corrupted image.
In the second stage, a semantic decoder that takes the reconstructed features as a prior is adopted to synthesize a fine inpainting result guided by the user's customized semantic mask.
arXiv Detail & Related papers (2022-01-26T05:09:42Z)
- Learning Sparse Masks for Diffusion-based Image Inpainting [10.633099921979674]
Diffusion-based inpainting is a powerful tool for the reconstruction of images from sparse data.
We provide a model for highly efficient adaptive mask generation.
Experiments indicate that our model achieves competitive quality with a speed-up of as much as four orders of magnitude.
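For context, the operator such masks are optimised for, homogeneous diffusion inpainting, is easy to sketch: known pixels stay fixed while unknown pixels relax towards the average of their neighbours. Below is a simplified Jacobi iteration in numpy; the periodic boundaries implied by `np.roll` are a shortcut of this sketch, and real implementations use reflecting boundaries and much faster solvers:

```python
import numpy as np

def diffusion_inpaint(image, mask, n_iter=500):
    # Homogeneous diffusion inpainting: pixels with mask == 1 are known and
    # stay fixed; unknown pixels are iteratively replaced by the average of
    # their four neighbours (a Jacobi solver for the Laplace equation).
    u = np.where(mask > 0, image, image[mask > 0].mean()).astype(np.float64)
    for _ in range(n_iter):
        avg = 0.25 * (np.roll(u, 1, 0) + np.roll(u, -1, 0)
                      + np.roll(u, 1, 1) + np.roll(u, -1, 1))
        u = np.where(mask > 0, image, avg)   # re-impose the known data
    return u
```

The quality of the result depends heavily on which pixels the mask keeps, which is exactly the spatial optimisation problem these networks learn to solve.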
arXiv Detail & Related papers (2021-10-06T10:20:59Z)
- Spatial-Separated Curve Rendering Network for Efficient and High-Resolution Image Harmonization [59.19214040221055]
We propose a novel spatial-separated curve rendering network (S$^2$CRNet) for efficient and high-resolution image harmonization.
The proposed method uses more than 90% fewer parameters than previous methods.
Our method works smoothly on higher-resolution images in real time, running more than 10$\times$ faster than existing methods.
arXiv Detail & Related papers (2021-09-13T07:20:16Z)
- Spatially-Adaptive Image Restoration using Distortion-Guided Networks [51.89245800461537]
We present a learning-based solution for restoring images suffering from spatially-varying degradations.
We propose SPAIR, a network design that harnesses distortion-localization information and dynamically adjusts to difficult regions in the image.
arXiv Detail & Related papers (2021-08-19T11:02:25Z)
- R-MNet: A Perceptual Adversarial Network for Image Inpainting [5.471225956329675]
We propose a Wasserstein GAN combined with a new reverse mask operator, namely Reverse Masking Network (R-MNet), a perceptual adversarial network for image inpainting.
We show that our method generalizes to high-resolution inpainting tasks and produces more realistic outputs that are plausible to the human visual system.
arXiv Detail & Related papers (2020-08-11T10:58:10Z)
- Very Long Natural Scenery Image Prediction by Outpainting [96.8509015981031]
Outpainting receives less attention than inpainting due to two challenges.
The first challenge is keeping spatial and content consistency between the generated images and the original input.
The second challenge is maintaining high quality in the generated results.
arXiv Detail & Related papers (2019-12-29T16:29:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information (including all content) and is not responsible for any consequences of its use.