Pyramidal Denoising Diffusion Probabilistic Models
- URL: http://arxiv.org/abs/2208.01864v1
- Date: Wed, 3 Aug 2022 06:26:18 GMT
- Title: Pyramidal Denoising Diffusion Probabilistic Models
- Authors: Dohoon Ryu, Jong Chul Ye
- Abstract summary: We present a novel pyramidal diffusion model that generates high-resolution images using a single score function trained with a positional embedding.
This enables time-efficient sampling for image generation and also alleviates the low-batch-size problem when training with limited resources.
- Score: 43.9925721757248
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Diffusion models have demonstrated impressive image generation performance,
and have been used in various computer vision tasks. Unfortunately, image
generation using diffusion models is very time-consuming since it requires
thousands of sampling steps. To address this problem, here we present a novel
pyramidal diffusion model that generates high-resolution images starting from much
coarser-resolution images, using a single score function trained with a
positional embedding. This enables time-efficient sampling for image
generation and also alleviates the low-batch-size problem when training with
limited resources. Furthermore, we show that the proposed approach can be
used efficiently for multi-scale super-resolution problems with a single score
function.
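As a rough illustration of the pyramidal sampling idea described in the abstract, the sketch below runs a short reverse-diffusion loop at each resolution level, upsampling and re-noising the result before refining it at the next scale, with one shared network conditioned on a level embedding. Every name and number in it (ScoreNet, the resolution schedule, the noise schedule, the re-noising strength) is an assumption made for illustration, not the authors' implementation.

```python
# Illustrative coarse-to-fine ("pyramidal") sampling with a single score
# network shared across resolutions. All names and hyperparameters here are
# assumptions for illustration, not the paper's implementation.
import torch
import torch.nn.functional as F

class ScoreNet(torch.nn.Module):
    """Placeholder noise-prediction network shared by every pyramid level."""
    def __init__(self, channels=3, emb_dim=16):
        super().__init__()
        self.conv = torch.nn.Conv2d(channels + emb_dim, channels, 3, padding=1)
        self.emb = torch.nn.Embedding(8, emb_dim)  # supports up to 8 pyramid levels

    def forward(self, x, t, level):
        # Broadcast a learned level embedding over the image; timestep
        # conditioning is omitted in this toy placeholder.
        e = self.emb(level).view(1, -1, 1, 1).expand(x.shape[0], -1, *x.shape[2:])
        return self.conv(torch.cat([x, e], dim=1))

def ddpm_reverse(x, model, level, steps, betas):
    """A short reverse-diffusion loop at one resolution (simplified DDPM update)."""
    alphas = 1.0 - betas
    alpha_bar = torch.cumprod(alphas, dim=0)
    for t in reversed(range(steps)):
        eps = model(x, t, torch.tensor(level))
        x = (x - betas[t] / torch.sqrt(1.0 - alpha_bar[t]) * eps) / torch.sqrt(alphas[t])
        if t > 0:
            x = x + torch.sqrt(betas[t]) * torch.randn_like(x)
    return x

def pyramidal_sample(model, sizes=(32, 64, 128), steps_per_level=50):
    betas = torch.linspace(1e-4, 0.02, steps_per_level)
    x = torch.randn(1, 3, sizes[0], sizes[0])  # pure noise at the coarsest scale
    for level, size in enumerate(sizes):
        if x.shape[-1] != size:
            # Upsample the previous result, then re-noise it before refining.
            x = F.interpolate(x, size=size, mode="bilinear", align_corners=False)
            x = x + 0.5 * torch.randn_like(x)
        x = ddpm_reverse(x, model, level, steps_per_level, betas)
    return x

sample = pyramidal_sample(ScoreNet())
print(sample.shape)  # torch.Size([1, 3, 128, 128])
```

The shared ScoreNet stands in for the paper's single score function; the level embedding is what lets one network serve every resolution in the pyramid.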
Related papers
- Multi-Feature Aggregation in Diffusion Models for Enhanced Face Super-Resolution [6.055006354743854]
We develop an algorithm that utilizes a low-resolution image combined with features extracted from multiple low-quality images to generate a super-resolved image.
Unlike other algorithms, our approach recovers facial features without explicitly providing attribute information.
This is the first time that multiple features combined with low-resolution images are used as conditioners to generate more reliable super-resolution images.
arXiv Detail & Related papers (2024-08-27T20:08:33Z) - SpotDiffusion: A Fast Approach For Seamless Panorama Generation Over Time [7.532695984765271]
We present a novel approach to generate high-resolution images with generative models.
Our method shifts non-overlapping denoising windows over time, ensuring that seams in one timestep are corrected in the next (a minimal sketch of this shifted-window idea appears after the related-papers list below).
Our method offers several key benefits, including improved computational efficiency and faster inference times.
arXiv Detail & Related papers (2024-07-22T09:44:35Z) - Efficient Conditional Diffusion Model with Probability Flow Sampling for Image Super-resolution [35.55094110634178]
We propose an efficient conditional diffusion model with probability flow sampling for image super-resolution.
Our method achieves higher super-resolution quality than existing diffusion-based image super-resolution methods.
arXiv Detail & Related papers (2024-04-16T16:08:59Z) - ScaleCrafter: Tuning-free Higher-Resolution Visual Generation with Diffusion Models [126.35334860896373]
We investigate the capability of generating images from pre-trained diffusion models at much higher resolutions than the training image sizes.
Existing works for higher-resolution generation, such as attention-based and joint-diffusion approaches, cannot adequately address the artifacts that appear when sampling beyond the training resolution.
We propose a simple yet effective re-dilation that dynamically adjusts the convolutional receptive field during inference (a minimal sketch of this idea follows the related-papers list below).
arXiv Detail & Related papers (2023-10-11T17:52:39Z) - Accelerating Guided Diffusion Sampling with Splitting Numerical Methods [8.689906452450938]
Recent techniques can accelerate unguided sampling by applying high-order numerical methods to the sampling process, but their quality degrades when they are applied directly to guided sampling.
This paper explores the culprit of this problem and provides a solution based on operator splitting methods.
Our proposed method can reuse these high-order methods for guided sampling and generates images of the same quality as a 250-step DDIM baseline.
arXiv Detail & Related papers (2023-01-27T06:48:29Z) - Super-resolution Reconstruction of Single Image for Latent features [8.857209365343646]
Single-image super-resolution (SISR) typically focuses on restoring various degraded low-resolution (LR) images to a single high-resolution (HR) image.
It is often challenging for models to simultaneously maintain high quality and rapid sampling while preserving diversity in details and texture features.
This challenge can lead to issues such as model collapse, lack of rich details and texture features in the reconstructed HR images, and excessive time consumption for model sampling.
arXiv Detail & Related papers (2022-11-16T09:37:07Z) - Markup-to-Image Diffusion Models with Scheduled Sampling [111.30188533324954]
Building on recent advances in image generation, we present a data-driven approach to rendering markup into images.
The approach is based on diffusion models, which parameterize the distribution of data using a sequence of denoising operations.
We conduct experiments on four markup datasets: mathematical formulas (LaTeX), table layouts (HTML), sheet music (LilyPond), and molecular images (SMILES).
arXiv Detail & Related papers (2022-10-11T04:56:12Z) - Image Generation with Multimodal Priors using Denoising Diffusion Probabilistic Models [54.1843419649895]
A major challenge in using generative models to synthesize images from multimodal priors is the lack of paired data containing all modalities and corresponding outputs.
We propose a solution based on denoising diffusion probabilistic models to generate images under multimodal priors.
arXiv Detail & Related papers (2022-06-10T12:23:05Z) - Dynamic Dual-Output Diffusion Models [100.32273175423146]
Iterative denoising-based generation has been shown to be comparable in quality to other classes of generative models.
A major drawback of this method is that it requires hundreds of iterations to produce a competitive result.
Recent works have proposed solutions that allow for faster generation with fewer iterations, but the image quality gradually deteriorates.
arXiv Detail & Related papers (2022-03-08T11:20:40Z) - Denoising Diffusion Restoration Models [110.1244240726802]
Denoising Diffusion Restoration Models (DDRM) is an efficient, unsupervised posterior sampling method.
We demonstrate DDRM's versatility on several image datasets for super-resolution, deblurring, inpainting, and colorization.
arXiv Detail & Related papers (2022-01-27T20:19:07Z)
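For the SpotDiffusion entry above, the shifted-window idea can be sketched as follows: at each timestep the panorama is partitioned into non-overlapping windows whose offset changes with the step, so seams left by one partition fall inside a window at the next. The window size, shift schedule, and the denoise_tile stand-in are assumptions for illustration, not that paper's implementation.

```python
# Illustrative shifted-window denoising for a wide (panorama) latent: the
# non-overlapping window grid moves with the timestep so seams from one step
# fall inside a window at the next. Window size, shift schedule and the
# denoise_tile stand-in are assumptions, not the SpotDiffusion implementation.
import torch

def denoise_tile(tile, t):
    # Stand-in for one reverse-diffusion step on a single window.
    return tile - 0.01 * torch.randn_like(tile)

def shifted_window_step(x, t, window=64):
    """One denoising step over a wide image with timestep-dependent window offsets."""
    _, _, _, w = x.shape
    offset = (t * window // 4) % window             # different seams at each step
    x = torch.roll(x, shifts=-offset, dims=-1)      # wrap-around horizontal shift
    out = x.clone()
    for start in range(0, w, window):
        out[..., start:start + window] = denoise_tile(x[..., start:start + window], t)
    return torch.roll(out, shifts=offset, dims=-1)  # undo the shift

x = torch.randn(1, 3, 64, 512)                      # noisy panorama latent
for t in reversed(range(50)):
    x = shifted_window_step(x, t)
print(x.shape)  # torch.Size([1, 3, 64, 512])
```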
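Likewise, the re-dilation mentioned in the ScaleCrafter entry can be approximated by temporarily scaling the dilation and padding of a pretrained model's 3x3 convolutions at inference time, so the receptive field grows with the sampling resolution. The scaling factor and the layer-selection rule below are assumptions for illustration, not the ScaleCrafter code.

```python
# Illustrative "re-dilation": temporarily scale the dilation and padding of a
# pretrained model's 3x3 convolutions at inference time so the receptive field
# matches a higher sampling resolution. The factor and the layer-selection rule
# are assumptions, not the ScaleCrafter code.
import torch

def redilate(module, factor=2):
    """Scale dilation/padding of every 3x3 Conv2d in place; return the originals."""
    saved = {}
    for name, m in module.named_modules():
        if isinstance(m, torch.nn.Conv2d) and m.kernel_size == (3, 3):
            saved[name] = (m.dilation, m.padding)
            m.dilation = (m.dilation[0] * factor, m.dilation[1] * factor)
            m.padding = (m.padding[0] * factor, m.padding[1] * factor)
    return saved

def restore(module, saved):
    for name, m in module.named_modules():
        if name in saved:
            m.dilation, m.padding = saved[name]

net = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3, padding=1), torch.nn.ReLU())
state = redilate(net, factor=2)                 # widen the receptive field
y = net(torch.randn(1, 3, 128, 128))
restore(net, state)                             # put the original geometry back
print(y.shape)  # torch.Size([1, 8, 128, 128])
```

Because the padding grows with the dilation, the spatial output size is unchanged while each kernel covers a wider neighbourhood.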
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.