DeepGIN: Deep Generative Inpainting Network for Extreme Image Inpainting
- URL: http://arxiv.org/abs/2008.07173v1
- Date: Mon, 17 Aug 2020 09:30:28 GMT
- Title: DeepGIN: Deep Generative Inpainting Network for Extreme Image Inpainting
- Authors: Chu-Tak Li, Wan-Chi Siu, Zhi-Song Liu, Li-Wen Wang, and Daniel
Pak-Kong Lun
- Abstract summary: We propose a deep generative inpainting network, named DeepGIN, to handle various types of masked images.
Our model is capable of completing masked images in the wild.
- Score: 45.39552853543588
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The degree of difficulty in image inpainting depends on the types and sizes
of the missing parts. Existing image inpainting approaches usually encounter
difficulties in completing the missing parts in the wild with pleasing visual
and contextual results, as they are trained either to deal with one specific
type of missing pattern (mask) or to unilaterally assume the shapes and/or
sizes of the masked areas. We propose a deep generative inpainting network,
named DeepGIN, to handle various types of masked images. We design a Spatial
Pyramid Dilation (SPD) ResNet block to enable the use of distant features for
reconstruction. We also employ a Multi-Scale Self-Attention (MSSA) mechanism and
a Back Projection (BP) technique to enhance our inpainting results. Our DeepGIN
generally outperforms state-of-the-art approaches on two publicly available
datasets (FFHQ and Oxford Buildings), both quantitatively and qualitatively. We
also demonstrate that our model is capable of completing masked images in the
wild.
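The abstract does not spell out the SPD block, but the idea of gathering distant context can be pictured as a residual block whose parallel branches use different dilation rates before being fused. The PyTorch sketch below is a minimal illustration under assumed channel sizes, dilation rates (1, 2, 4, 8), and concatenation-based fusion; it is not the authors' exact design.

```python
import torch
import torch.nn as nn

class SPDResBlock(nn.Module):
    """Sketch of a Spatial Pyramid Dilation residual block.

    Parallel 3x3 convolutions with increasing dilation rates gather context
    at several receptive-field sizes; their outputs are concatenated, fused,
    and added back to the input (rates and fusion scheme are assumptions).
    """

    def __init__(self, channels: int, dilations=(1, 2, 4, 8)):
        super().__init__()
        branch_ch = channels // len(dilations)
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(channels, branch_ch, kernel_size=3,
                          padding=d, dilation=d),
                nn.ReLU(inplace=True),
            )
            for d in dilations
        )
        self.fuse = nn.Conv2d(branch_ch * len(dilations), channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        context = torch.cat([branch(x) for branch in self.branches], dim=1)
        return x + self.fuse(context)  # residual connection


if __name__ == "__main__":
    feats = torch.randn(1, 64, 64, 64)      # (batch, channels, H, W)
    print(SPDResBlock(64)(feats).shape)     # torch.Size([1, 64, 64, 64])
```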
Related papers
- VIP: Versatile Image Outpainting Empowered by Multimodal Large Language Model [76.02314305164595]
This work presents a novel image outpainting framework that is capable of customizing the results according to the requirement of users.
We take advantage of a Multimodal Large Language Model (MLLM) that automatically extracts and organizes the corresponding textual descriptions of the masked and unmasked part of a given image.
In addition, a special Cross-Attention module, namely Center-Total-Surrounding (CTS), is carefully designed to further enhance the interaction between specific spatial regions of the image and the corresponding parts of the text prompts.
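The summary gives no architectural details, but the image-text interaction it describes amounts to cross-attention in which image-region features query text-prompt embeddings. The sketch below is a generic, assumed illustration of that pattern (dimensions, the use of nn.MultiheadAttention, and the residual/norm layout are all assumptions); it is not the paper's CTS module.

```python
import torch
import torch.nn as nn

class RegionTextCrossAttention(nn.Module):
    """Generic sketch: image-region tokens attend to text-prompt tokens."""

    def __init__(self, dim: int = 256, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, region_tokens: torch.Tensor, text_tokens: torch.Tensor) -> torch.Tensor:
        # region_tokens: (B, N_regions, dim), text_tokens: (B, N_words, dim)
        attended, _ = self.attn(query=region_tokens, key=text_tokens, value=text_tokens)
        return self.norm(region_tokens + attended)  # residual + layer norm


if __name__ == "__main__":
    regions = torch.randn(2, 16, 256)   # e.g. masked / unmasked patch features
    words = torch.randn(2, 32, 256)     # text embeddings produced by an MLLM
    print(RegionTextCrossAttention()(regions, words).shape)  # (2, 16, 256)
```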
arXiv Detail & Related papers (2024-06-03T07:14:19Z)
- Sketch-guided Image Inpainting with Partial Discrete Diffusion Process [5.005162730122933]
We introduce a novel partial discrete diffusion process (PDDP) for sketch-guided inpainting.
PDDP corrupts the masked regions of the image and reconstructs these masked regions conditioned on hand-drawn sketches.
The proposed novel transformer module accepts two inputs -- the image containing the masked region to be inpainted and the query sketch to model the reverse diffusion process.
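As a hedged illustration of the "partial" corruption idea: in a discrete diffusion over image tokens, only tokens inside the inpainting mask are randomly replaced as the timestep grows, while tokens outside the mask stay fixed. The absorbing mask token, linear schedule, and token vocabulary below are assumptions made for the sketch, not the paper's exact process.

```python
import torch

def partial_discrete_corrupt(tokens: torch.Tensor,
                             region_mask: torch.Tensor,
                             t: int, T: int,
                             mask_token_id: int) -> torch.Tensor:
    """Corrupt only the masked region of a token map (absorbing-state sketch).

    tokens:      (H, W) integer image tokens (e.g. from a VQ codebook)
    region_mask: (H, W) bool, True where the image is missing
    t, T:        current / total diffusion steps (corruption grows with t)
    """
    corrupt_prob = t / T                              # assumed linear schedule
    coin = torch.rand_like(tokens, dtype=torch.float)
    to_corrupt = region_mask & (coin < corrupt_prob)  # known pixels never change
    return torch.where(to_corrupt, torch.full_like(tokens, mask_token_id), tokens)


if __name__ == "__main__":
    toks = torch.randint(0, 1024, (16, 16))
    hole = torch.zeros(16, 16, dtype=torch.bool)
    hole[4:12, 4:12] = True
    noisy = partial_discrete_corrupt(toks, hole, t=5, T=10, mask_token_id=1024)
    print((noisy[hole] != toks[hole]).float().mean())  # ~0.5 of the hole corrupted
```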
arXiv Detail & Related papers (2024-04-18T07:07:38Z)
- Layered Depth Refinement with Mask Guidance [61.10654666344419]
We formulate a novel problem of mask-guided depth refinement that utilizes a generic mask to refine the depth prediction of SIDE models.
Our framework performs layered refinement and inpainting/outpainting, decomposing the depth map into two separate layers signified by the mask and the inverse mask.
We empirically show that our method is robust to different types of masks and initial depth predictions, accurately refining depth values in inner and outer mask boundary regions.
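The layered decomposition described above can be shown in a few lines: the initial depth prediction is split into a layer defined inside the mask and a layer defined inside the inverse mask, and the zeroed-out part of each layer is what a refinement network would later inpaint or outpaint. This is a minimal sketch of the idea, not the authors' pipeline.

```python
import torch

def split_depth_layers(depth: torch.Tensor, mask: torch.Tensor):
    """Split a depth map into mask / inverse-mask layers.

    depth: (H, W) initial depth prediction from a SIDE model
    mask:  (H, W) in {0, 1}, 1 for the foreground region
    Returns (foreground_layer, background_layer); the missing area of each
    layer is what a refinement network would inpaint/outpaint.
    """
    foreground = depth * mask            # defined only inside the mask
    background = depth * (1.0 - mask)    # defined only inside the inverse mask
    return foreground, background


if __name__ == "__main__":
    depth = torch.rand(240, 320)
    mask = (torch.rand(240, 320) > 0.5).float()
    fg, bg = split_depth_layers(depth, mask)
    assert torch.allclose(fg + bg, depth)   # the two layers tile the depth map
```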
arXiv Detail & Related papers (2022-06-07T06:42:44Z)
- Towards Reliable Image Outpainting: Learning Structure-Aware Multimodal Fusion with Depth Guidance [49.94504248096527]
We propose a Depth-Guided Outpainting Network (DGONet) to model the feature representations of different modalities.
Two components are designed: 1) a Multimodal Learning Module that produces distinct depth and RGB feature representations from the perspectives of the different modal characteristics.
We specially design an additional constraint strategy consisting of Cross-modal Loss and Edge Loss to enhance ambiguous contours and expedite reliable content generation.
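One plausible way to wire up such a constraint strategy is sketched below: a cross-modal term keeps RGB and depth features consistent, and an edge term penalizes differences between edge maps of the prediction and the ground truth. The specific loss forms (L1 on feature maps, Sobel-based edges) and the weights are assumptions, not the paper's definitions.

```python
import torch
import torch.nn.functional as F

def sobel_edges(img: torch.Tensor) -> torch.Tensor:
    """Approximate edge magnitude of a (B, 1, H, W) image with Sobel filters."""
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]],
                      device=img.device).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)
    gx = F.conv2d(img, kx, padding=1)
    gy = F.conv2d(img, ky, padding=1)
    return torch.sqrt(gx ** 2 + gy ** 2 + 1e-6)

def outpainting_constraints(rgb_feat, depth_feat, pred_gray, gt_gray,
                            w_cross=1.0, w_edge=1.0):
    """Assumed combination of a cross-modal consistency term and an edge term."""
    cross_modal_loss = F.l1_loss(rgb_feat, depth_feat)              # modality agreement
    edge_loss = F.l1_loss(sobel_edges(pred_gray), sobel_edges(gt_gray))
    return w_cross * cross_modal_loss + w_edge * edge_loss


if __name__ == "__main__":
    rgb_f, dep_f = torch.randn(2, 64, 32, 32), torch.randn(2, 64, 32, 32)
    pred, gt = torch.rand(2, 1, 128, 128), torch.rand(2, 1, 128, 128)
    print(outpainting_constraints(rgb_f, dep_f, pred, gt).item())
```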
arXiv Detail & Related papers (2022-04-12T06:06:50Z)
- Image Inpainting with Edge-guided Learnable Bidirectional Attention Maps [85.67745220834718]
We present an edge-guided learnable bidirectional attention map (Edge-LBAM) for improving image inpainting of irregular holes.
Our Edge-LBAM method contains dual procedures, including structure-aware mask-updating guided by predicted edges.
Extensive experiments show that our Edge-LBAM is effective in generating coherent image structures and preventing color discrepancy and blurriness.
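To give a rough sense of the structure-aware mask-updating mentioned above: in partial-convolution-style inpainting, the validity mask shrinks after each layer as holes are filled, and an edge-aware variant could let a predicted edge map bias that update. The sketch below combines these two ideas under stated assumptions; it is not the Edge-LBAM formulation.

```python
import torch
import torch.nn.functional as F

def update_mask(mask: torch.Tensor, edge_map: torch.Tensor,
                kernel_size: int = 3, edge_weight: float = 0.5) -> torch.Tensor:
    """Shrink the hole mask after a conv layer, biased by predicted edges.

    mask:     (B, 1, H, W), 1 = known pixels, 0 = hole
    edge_map: (B, 1, H, W) in [0, 1], predicted structural edges
    A hole pixel becomes 'known' once enough valid neighbours exist; the
    assumed edge term makes pixels on strong predicted edges easier to accept.
    """
    kernel = torch.ones(1, 1, kernel_size, kernel_size, device=mask.device)
    valid_neighbours = F.conv2d(mask, kernel, padding=kernel_size // 2)
    coverage = valid_neighbours / kernel.numel()      # fraction of valid neighbours
    score = coverage + edge_weight * edge_map * (1.0 - mask)
    return torch.max(mask, (score > 0.5).float())     # known pixels stay known


if __name__ == "__main__":
    m = torch.ones(1, 1, 64, 64)
    m[:, :, 20:44, 20:44] = 0.0                        # square hole
    edges = torch.zeros_like(m)
    updated = update_mask(m, edges)
    print(m.sum().item(), "->", updated.sum().item())  # hole border shrinks
```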
arXiv Detail & Related papers (2021-04-25T07:25:16Z)
- Free-Form Image Inpainting via Contrastive Attention Network [64.05544199212831]
In image inpainting tasks, masks of arbitrary shape can appear anywhere in an image, forming complex patterns.
It is difficult for encoders to learn powerful representations under such complex conditions.
We propose a self-supervised Siamese inference network to improve the robustness and generalization.
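The summary does not spell out the self-supervision, but a Siamese encoder pair for inpainting is commonly trained with a contrastive objective: features of a masked view and of the full view of the same image are pulled together, while features of other images are pushed apart. The InfoNCE-style sketch below is written under that assumption and is not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def siamese_contrastive_loss(masked_feat: torch.Tensor,
                             full_feat: torch.Tensor,
                             temperature: float = 0.1) -> torch.Tensor:
    """InfoNCE-style loss between masked-view and full-view embeddings.

    masked_feat, full_feat: (B, D) pooled encoder features of the same images,
    one branch seeing the masked input and the other the complete input.
    """
    masked_feat = F.normalize(masked_feat, dim=1)
    full_feat = F.normalize(full_feat, dim=1)
    logits = masked_feat @ full_feat.t() / temperature   # (B, B) similarity matrix
    targets = torch.arange(masked_feat.size(0), device=masked_feat.device)
    return F.cross_entropy(logits, targets)              # diagonal = positive pairs


if __name__ == "__main__":
    z_masked = torch.randn(8, 128)
    z_full = torch.randn(8, 128)
    print(siamese_contrastive_loss(z_masked, z_full).item())
```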
arXiv Detail & Related papers (2020-10-29T14:46:05Z)
- Deep Generative Model for Image Inpainting with Local Binary Pattern Learning and Spatial Attention [28.807711307545112]
We propose a new end-to-end, two-stage (coarse-to-fine) generative model through combining a local binary pattern (LBP) learning network with an actual inpainting network.
Experiments on public datasets including CelebA-HQ, Places and Paris StreetView demonstrate that our model generates better inpainting results than the state-of-the-art competing algorithms.
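The coarse-to-fine flow described above can be schematized as a first network that predicts a structural (LBP-like) map for the hole and a second network that completes the image conditioned on that map. The tiny convolutional stand-ins below are placeholders meant only to show the data flow, not the paper's architecture.

```python
import torch
import torch.nn as nn

class TwoStageInpainter(nn.Module):
    """Coarse-to-fine sketch: structure prediction first, image completion second."""

    def __init__(self):
        super().__init__()
        # Stage 1: predict a 1-channel structure map (LBP-like) from image + mask.
        self.structure_net = nn.Sequential(
            nn.Conv2d(4, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 1, 3, padding=1), nn.Sigmoid(),
        )
        # Stage 2: complete the image from image + mask + predicted structure.
        self.inpaint_net = nn.Sequential(
            nn.Conv2d(5, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, image: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        masked = image * (1.0 - mask)                        # zero out the hole (mask=1)
        structure = self.structure_net(torch.cat([masked, mask], dim=1))
        completed = self.inpaint_net(torch.cat([masked, mask, structure], dim=1))
        return completed * mask + image * (1.0 - mask)       # keep known pixels


if __name__ == "__main__":
    img = torch.rand(1, 3, 128, 128)
    hole = torch.zeros(1, 1, 128, 128)
    hole[:, :, 40:90, 40:90] = 1.0
    print(TwoStageInpainter()(img, hole).shape)              # torch.Size([1, 3, 128, 128])
```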
arXiv Detail & Related papers (2020-09-02T12:59:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.