Related papers: SDM: Spatial Diffusion Model for Large Hole Image Inpainting

SDM: Spatial Diffusion Model for Large Hole Image Inpainting

URL: http://arxiv.org/abs/2212.02963v1
Date: Tue, 6 Dec 2022 13:30:18 GMT
Title: SDM: Spatial Diffusion Model for Large Hole Image Inpainting
Authors: Wenbo Li, Xin Yu, Kun Zhou, Yibing Song, Zhe Lin, Jiaya Jia
Abstract summary: We present a novel spatial diffusion model (SDM) that uses a few iterations to gradually deliver informative pixels to the entire image. Also, thanks to the proposed decoupled probabilistic modeling and spatial diffusion scheme, our method achieves high-quality large-hole completion.
Score: 106.90795513361498
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Generative adversarial networks (GANs) have made great success in image inpainting yet still have difficulties tackling large missing regions. In contrast, iterative algorithms, such as autoregressive and denoising diffusion models, have to be deployed with massive computing resources for decent effect. To overcome the respective limitations, we present a novel spatial diffusion model (SDM) that uses a few iterations to gradually deliver informative pixels to the entire image, largely enhancing the inference efficiency. Also, thanks to the proposed decoupled probabilistic modeling and spatial diffusion scheme, our method achieves high-quality large-hole completion. On multiple benchmarks, we achieve new state-of-the-art performance. Code is released at https://github.com/fenglinglwb/SDM.

Related papers

SEM-Net: Efficient Pixel Modelling for image inpainting with Spatially Enhanced SSM [11.447968918063335]
Image inpainting aims to repair a partially damaged image based on the information from known regions of the images. SEM-Net is a novel visual State Space model (SSM) vision network, modelling corrupted images at the pixel level while capturing long-range dependencies (LRDs) in state space.
arXiv Detail & Related papers (2024-11-10T00:35:14Z)
Fast constrained sampling in pre-trained diffusion models [77.21486516041391]
We propose an algorithm that enables fast and high-quality generation under arbitrary constraints. During inference, we can interchange between gradient updates computed on the noisy image and updates computed on the final, clean image. Our approach produces results that rival or surpass the state-of-the-art training-free inference approaches.
arXiv Detail & Related papers (2024-10-24T14:52:38Z)
SpotDiffusion: A Fast Approach For Seamless Panorama Generation Over Time [7.532695984765271]
We present a novel approach to generate high-resolution images with generative models. Our method shifts non-overlapping denoising windows over time, ensuring that seams in one timestep are corrected in the next. Our method offers several key benefits, including improved computational efficiency and faster inference times.
arXiv Detail & Related papers (2024-07-22T09:44:35Z)
$\infty$-Brush: Controllable Large Image Synthesis with Diffusion Models in Infinite Dimensions [58.42011190989414]
We introduce a novel conditional diffusion model in infinite dimensions, $infty$-Brush for controllable large image synthesis. To our best knowledge, $infty$-Brush is the first conditional diffusion model in function space, that can controllably synthesize images at arbitrary resolutions of up to $4096times4096$ pixels.
arXiv Detail & Related papers (2024-07-20T00:04:49Z)
Coherent and Multi-modality Image Inpainting via Latent Space Optimization [61.99406669027195]
PILOT (intextbfPainting vtextbfIa textbfLatent textbfOptextbfTimization) is an optimization approach grounded on a novel textitsemantic centralization and textitbackground preservation loss. Our method searches latent spaces capable of generating inpainted regions that exhibit high fidelity to user-provided prompts while maintaining coherence with the background.
arXiv Detail & Related papers (2024-07-10T19:58:04Z)
Arbitrary-Scale Image Generation and Upsampling using Latent Diffusion Model and Implicit Neural Decoder [29.924160271522354]
Super-resolution (SR) and image generation are important tasks in computer vision and are widely adopted in real-world applications. Most existing methods, however, generate images only at fixed-scale magnification and suffer from over-smoothing and artifacts. Most relevant work applied Implicit Neural Representation (INR) to the denoising diffusion model to obtain continuous-resolution yet diverse and high-quality SR results. We propose a novel pipeline that can super-resolve an input image or generate from a random noise a novel image at arbitrary scales.
arXiv Detail & Related papers (2024-03-15T12:45:40Z)
Domain Transfer in Latent Space (DTLS) Wins on Image Super-Resolution -- a Non-Denoising Model [13.326634982790528]
We propose a simple approach which gets away from using Gaussian noise but adopts some basic structures of diffusion models for efficient image super-resolution. Experimental results show that our method outperforms not only state-of-the-art large scale super resolution models, but also the current diffusion models for image super-resolution.
arXiv Detail & Related papers (2023-11-04T09:57:50Z)
Denoising Diffusion Bridge Models [54.87947768074036]
Diffusion models are powerful generative models that map noise to data using processes. For many applications such as image editing, the model input comes from a distribution that is not random noise. In our work, we propose Denoising Diffusion Bridge Models (DDBMs)
arXiv Detail & Related papers (2023-09-29T03:24:24Z)
Hierarchical Integration Diffusion Model for Realistic Image Deblurring [71.76410266003917]
Diffusion models (DMs) have been introduced in image deblurring and exhibited promising performance. We propose the Hierarchical Integration Diffusion Model (HI-Diff), for realistic image deblurring. Experiments on synthetic and real-world blur datasets demonstrate that our HI-Diff outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-05-22T12:18:20Z)
Image Fine-grained Inpainting [89.17316318927621]
We present a one-stage model that utilizes dense combinations of dilated convolutions to obtain larger and more effective receptive fields. To better train this efficient generator, except for frequently-used VGG feature matching loss, we design a novel self-guided regression loss. We also employ a discriminator with local and global branches to ensure local-global contents consistency.
arXiv Detail & Related papers (2020-02-07T03:45:25Z)

This list is automatically generated from the titles and abstracts of the papers in this site.