A Structure-Guided Diffusion Model for Large-Hole Image Completion
- URL: http://arxiv.org/abs/2211.10437v3
- Date: Wed, 6 Sep 2023 07:28:33 GMT
- Title: A Structure-Guided Diffusion Model for Large-Hole Image Completion
- Authors: Daichi Horita, Jiaolong Yang, Dong Chen, Yuki Koyama, Kiyoharu Aizawa,
Nicu Sebe
- Abstract summary: We develop a structure-guided diffusion model to fill large holes in images.
Our method achieves a superior or comparable visual quality compared to state-of-the-art approaches.
- Score: 85.61681358977266
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image completion techniques have made significant progress in filling missing
regions (i.e., holes) in images. However, large-hole completion remains
challenging due to limited structural information. In this paper, we address
this problem by integrating explicit structural guidance into diffusion-based
image completion, forming our structure-guided diffusion model (SGDM). It
consists of two cascaded diffusion probabilistic models: structure and texture
generators. The structure generator generates an edge image representing
plausible structures within the holes, which is then used for guiding the
texture generation process. To train both generators jointly, we devise a novel
strategy that leverages optimal Bayesian denoising, which denoises the output
of the structure generator in a single step and thus allows backpropagation.
Our diffusion-based approach enables a diversity of plausible completions,
while the editable edges allow for editing parts of an image. Our experiments
on natural scene (Places) and face (CelebA-HQ) datasets demonstrate that our
method achieves a superior or comparable visual quality compared to
state-of-the-art approaches. The code is available for research purposes at
https://github.com/UdonDa/Structure_Guided_Diffusion_Model.
Related papers
- Coherent and Multi-modality Image Inpainting via Latent Space Optimization [61.99406669027195]
PILOT (intextbfPainting vtextbfIa textbfLatent textbfOptextbfTimization) is an optimization approach grounded on a novel textitsemantic centralization and textitbackground preservation loss.
Our method searches latent spaces capable of generating inpainted regions that exhibit high fidelity to user-provided prompts while maintaining coherence with the background.
arXiv Detail & Related papers (2024-07-10T19:58:04Z) - Coarse-to-Fine Latent Diffusion for Pose-Guided Person Image Synthesis [65.7968515029306]
We propose a novel Coarse-to-Fine Latent Diffusion (CFLD) method for Pose-Guided Person Image Synthesis (PGPIS)
A perception-refined decoder is designed to progressively refine a set of learnable queries and extract semantic understanding of person images as a coarse-grained prompt.
arXiv Detail & Related papers (2024-02-28T06:07:07Z) - Compressing Image-to-Image Translation GANs Using Local Density
Structures on Their Learned Manifold [69.33930972652594]
Generative Adversarial Networks (GANs) have shown remarkable success in modeling complex data distributions for image-to-image translation.
Existing GAN compression methods mainly rely on knowledge distillation or convolutional classifiers' pruning techniques.
We propose a new approach by explicitly encouraging the pruned model to preserve the density structure of the original parameter-heavy model on its learned manifold.
Our experiments on image translation GAN models, Pix2Pix and CycleGAN, with various benchmark datasets and architectures demonstrate our method's effectiveness.
arXiv Detail & Related papers (2023-12-22T15:43:12Z) - Semantic Image Translation for Repairing the Texture Defects of Building
Models [16.764719266178655]
We introduce a novel approach for synthesizing faccade texture images that authentically reflect the architectural style from a structured label map.
Our proposed method is also capable of synthesizing texture images with specific styles for faccades that lack pre-existing textures.
arXiv Detail & Related papers (2023-03-30T14:38:53Z) - Contour Completion using Deep Structural Priors [1.7399355670260819]
We present a framework that completes disconnected contours and connects fragmented lines and curves.
In our framework, we propose a model that does not even need to know which regions of the contour are eliminated.
Our work builds a robust framework to achieve contour completion using deep structural priors and extensively investigate how such a model could be implemented.
arXiv Detail & Related papers (2023-02-09T05:45:33Z) - Image Inpainting via Conditional Texture and Structure Dual Generation [26.97159780261334]
We propose a novel two-stream network for image inpainting, which models the structure-constrained texture synthesis and texture-guided structure reconstruction.
To enhance the global consistency, a Bi-directional Gated Feature Fusion (Bi-GFF) module is designed to exchange and combine the structure and texture information.
Experiments on the CelebA, Paris StreetView and Places2 datasets demonstrate the superiority of the proposed method.
arXiv Detail & Related papers (2021-08-22T15:44:37Z) - Generating Diverse Structure for Image Inpainting With Hierarchical
VQ-VAE [74.29384873537587]
We propose a two-stage model for diverse inpainting, where the first stage generates multiple coarse results each of which has a different structure, and the second stage refines each coarse result separately by augmenting texture.
Experimental results on CelebA-HQ, Places2, and ImageNet datasets show that our method not only enhances the diversity of the inpainting solutions but also improves the visual quality of the generated multiple images.
arXiv Detail & Related papers (2021-03-18T05:10:49Z) - Structural-analogy from a Single Image Pair [118.61885732829117]
In this paper, we explore the capabilities of neural networks to understand image structure given only a single pair of images, A and B.
We generate an image that keeps the appearance and style of B, but has a structural arrangement that corresponds to A.
Our method can be used to generate high quality imagery in other conditional generation tasks utilizing images A and B only.
arXiv Detail & Related papers (2020-04-05T14:51:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.