V-LinkNet: Learning Contextual Inpainting Across Latent Space of
Generative Adversarial Network
- URL: http://arxiv.org/abs/2201.00323v1
- Date: Sun, 2 Jan 2022 09:14:23 GMT
- Title: V-LinkNet: Learning Contextual Inpainting Across Latent Space of
Generative Adversarial Network
- Authors: Jireh Jam, Connah Kendrick, Vincent Drouard, Kevin Walker, Moi Hoon
Yap
- Abstract summary: We propose the V-LinkNet cross-space learning strategy network to improve learning on contextualised features.
We compare inpainting performance on the same face with different masks and on different faces with the same masks.
Our results surpass the state of the art when evaluated on CelebA-HQ with the standard protocol.
- Score: 7.5089719291325325
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning methods outperform traditional methods in image inpainting. In
order to generate contextual textures, researchers are still working to improve
on existing methods and propose models that can extract, propagate, and
reconstruct features similar to ground-truth regions. Furthermore, the lack of
a high-quality feature transfer mechanism in deeper layers contributes to
persistent aberrations in the generated inpainted regions. To address these
limitations, we propose the V-LinkNet cross-space learning strategy network. To
improve learning on contextualised features, we design a loss model that
employs both of the network's encoders. In addition, we propose a recursive residual transition
layer (RSTL). The RSTL extracts high-level semantic information and propagates
it down through the layers. Finally, we compare inpainting performance on the same face
with different masks and on different faces with the same masks. To improve
image inpainting reproducibility, we propose a standard protocol to overcome
biases with various masks and images. We investigate the contribution of each
V-LinkNet component experimentally. Our results surpass the state of the art when
evaluated on CelebA-HQ with the standard protocol. In addition, our model
generalises well when evaluated on the Paris Street View and Places2 datasets
with the standard protocol.
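The abstract describes the RSTL only at a high level: it extracts high-level semantic information and propagates it recursively down through the layers via residual refinement. Since the paper's exact layer definitions are not reproduced in this listing, the following is a minimal PyTorch sketch of one plausible reading of that idea; the module name, channel width, recursion depth, and normalisation choices are illustrative assumptions, not the authors' implementation.

# Hypothetical sketch of a recursive residual transition layer (RSTL),
# based only on the abstract's description: refine high-level semantic
# features with a residual block applied recursively before handing them
# to lower (decoder) layers. Names, depths, and channel sizes are assumptions.
import torch
import torch.nn as nn


class RecursiveResidualTransition(nn.Module):
    """Applies one shared residual refinement block recursively, so that
    high-level context is re-injected several times with tied weights."""

    def __init__(self, channels: int = 256, num_recursions: int = 3):
        super().__init__()
        self.num_recursions = num_recursions
        self.block = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.InstanceNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=2, dilation=2),
            nn.InstanceNorm2d(channels),
        )
        self.activation = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = x
        for _ in range(self.num_recursions):
            # Residual refinement: mix the block's output back into the features.
            out = self.activation(out + self.block(out))
        return out


if __name__ == "__main__":
    # Feature map as it might come from an encoder's deepest stage.
    features = torch.randn(1, 256, 32, 32)
    rstl = RecursiveResidualTransition(channels=256, num_recursions=3)
    print(rstl(features).shape)  # torch.Size([1, 256, 32, 32])

Tying the weights across recursions keeps the parameter count fixed while still letting high-level context be propagated repeatedly before decoding; the actual V-LinkNet layer may differ in depth, weight sharing, and placement.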
Related papers
- BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed
Dual-Branch Diffusion [61.90969199199739]
BrushNet is a novel plug-and-play dual-branch model engineered to embed pixel-level masked image features into any pre-trained diffusion model (DM).
Experiments demonstrate BrushNet's superior performance over existing models across seven key metrics, including image quality, masked region preservation, and textual coherence.
arXiv Detail & Related papers (2024-03-11T17:59:31Z)
- ENTED: Enhanced Neural Texture Extraction and Distribution for Reference-based Blind Face Restoration [51.205673783866146]
We present ENTED, a new framework for blind face restoration that aims to restore high-quality and realistic portrait images.
We utilize a texture extraction and distribution framework to transfer high-quality texture features between the degraded input and reference image.
The StyleGAN-like architecture in our framework requires high-quality latent codes to generate realistic images.
arXiv Detail & Related papers (2024-01-13T04:54:59Z)
- Semantic Image Synthesis via Class-Adaptive Cross-Attention [7.147779225315707]
Cross-attention layers are used in place of SPADE for learning shape-style correlations and thus conditioning the image generation process.
Our model inherits the versatility of SPADE, at the same time obtaining state-of-the-art generation quality, as well as improved global and local style transfer.
arXiv Detail & Related papers (2023-08-30T14:49:34Z)
- Diverse Inpainting and Editing with GAN Inversion [4.234367850767171]
Recent inversion methods have shown that real images can be inverted into StyleGAN's latent space.
In this paper, we tackle an even more difficult task, inverting erased images into GAN's latent space for realistic inpaintings and editings.
arXiv Detail & Related papers (2023-07-27T17:41:36Z)
- Parents and Children: Distinguishing Multimodal DeepFakes from Natural Images [60.34381768479834]
Recent advancements in diffusion models have enabled the generation of realistic deepfakes from textual prompts in natural language.
We pioneer a systematic study of detecting deepfakes generated by state-of-the-art diffusion models.
arXiv Detail & Related papers (2023-04-02T10:25:09Z)
- Modeling Image Composition for Complex Scene Generation [77.10533862854706]
We present a method that achieves state-of-the-art results on layout-to-image generation tasks.
We compress RGB images into patch tokens and propose the Transformer with Focal Attention (TwFA) to explore object-to-object, object-to-patch, and patch-to-patch dependencies.
arXiv Detail & Related papers (2022-06-02T08:34:25Z)
- A Wasserstein GAN for Joint Learning of Inpainting and its Spatial Optimisation [3.4392739159262145]
We propose the first generative adversarial network for spatial inpainting data optimisation.
In contrast to previous approaches, it allows joint training of an inpainting generator and a corresponding mask network.
This yields significant improvements in visual quality and speed over conventional models and also outperforms current optimisation networks.
arXiv Detail & Related papers (2022-02-11T14:02:36Z)
- FT-TDR: Frequency-guided Transformer and Top-Down Refinement Network for Blind Face Inpainting [77.78305705925376]
Blind face inpainting refers to the task of reconstructing visual contents without explicitly indicating the corrupted regions in a face image.
We propose a novel two-stage blind face inpainting method named Frequency-guided Transformer and Top-Down Refinement Network (FT-TDR) to tackle these challenges.
arXiv Detail & Related papers (2021-08-10T03:12:01Z)
- Image Inpainting with Edge-guided Learnable Bidirectional Attention Maps [85.67745220834718]
We present an edge-guided learnable bidirectional attention map (Edge-LBAM) for improving image inpainting of irregular holes.
Our Edge-LBAM method contains dual procedures, including structure-aware mask updating guided by predicted edges.
Extensive experiments show that our Edge-LBAM is effective in generating coherent image structures and preventing color discrepancy and blurriness.
arXiv Detail & Related papers (2021-04-25T07:25:16Z)
- Deep Generative Model for Image Inpainting with Local Binary Pattern Learning and Spatial Attention [28.807711307545112]
We propose a new end-to-end, two-stage (coarse-to-fine) generative model through combining a local binary pattern (LBP) learning network with an actual inpainting network.
Experiments on public datasets including CelebA-HQ, Places and Paris StreetView demonstrate that our model generates better inpainting results than the state-of-the-art competing algorithms.
arXiv Detail & Related papers (2020-09-02T12:59:28Z)
- Enhanced Residual Networks for Context-based Image Outpainting [0.0]
Deep models struggle to understand context and to extrapolate from retained information.
Current models use generative adversarial networks to generate results which lack localized image feature consistency and appear fake.
We propose two methods to improve this issue: the use of a local and global discriminator, and the addition of residual blocks within the encoding section of the network.
arXiv Detail & Related papers (2020-05-14T05:14:26Z)