Related papers: SpotDiffusion: A Fast Approach For Seamless Panorama Generation Over Time

SpotDiffusion: A Fast Approach For Seamless Panorama Generation Over Time

URL: http://arxiv.org/abs/2407.15507v1
Date: Mon, 22 Jul 2024 09:44:35 GMT
Title: SpotDiffusion: A Fast Approach For Seamless Panorama Generation Over Time
Authors: Stanislav Frolov, Brian B. Moser, Andreas Dengel,
Abstract summary: We present a novel approach to generate high-resolution images with generative models. Our method shifts non-overlapping denoising windows over time, ensuring that seams in one timestep are corrected in the next. Our method offers several key benefits, including improved computational efficiency and faster inference times.
Score: 7.532695984765271
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Generating high-resolution images with generative models has recently been made widely accessible by leveraging diffusion models pre-trained on large-scale datasets. Various techniques, such as MultiDiffusion and SyncDiffusion, have further pushed image generation beyond training resolutions, i.e., from square images to panorama, by merging multiple overlapping diffusion paths or employing gradient descent to maintain perceptual coherence. However, these methods suffer from significant computational inefficiencies due to generating and averaging numerous predictions, which is required in practice to produce high-quality and seamless images. This work addresses this limitation and presents a novel approach that eliminates the need to generate and average numerous overlapping denoising predictions. Our method shifts non-overlapping denoising windows over time, ensuring that seams in one timestep are corrected in the next. This results in coherent, high-resolution images with fewer overall steps. We demonstrate the effectiveness of our approach through qualitative and quantitative evaluations, comparing it with MultiDiffusion, SyncDiffusion, and StitchDiffusion. Our method offers several key benefits, including improved computational efficiency and faster inference times while producing comparable or better image quality.

Related papers

Reusing Computation in Text-to-Image Diffusion for Efficient Generation of Image Sets [19.950913420708734]
We propose a training-free approach that clusters prompts based on semantic similarity and shares in early diffusion steps.<n>Our method seamlessly integrates with existing pipelines, scales with prompt sets, and reduces the environmental and financial burden of large-scale text-to-image generation.
arXiv Detail & Related papers (2025-08-28T17:35:03Z)
Efficient Dataset Distillation via Diffusion-Driven Patch Selection for Improved Generalization [34.53986517177061]
We propose a novel framework to existing diffusion-based distillation methods, leveraging diffusion models for selection rather than generation. Our method starts by predicting noise generated by the diffusion model based on input images and text prompts, then calculates the corresponding loss for each pair. This streamlined framework enables a single-step distillation process, and extensive experiments demonstrate that our approach outperforms state-of-the-art methods across various metrics.
arXiv Detail & Related papers (2024-12-13T08:34:46Z)
Fast constrained sampling in pre-trained diffusion models [77.21486516041391]
Diffusion models have dominated the field of large, generative image models. We propose an algorithm for fast-constrained sampling in large pre-trained diffusion models.
arXiv Detail & Related papers (2024-10-24T14:52:38Z)
A Wavelet Diffusion GAN for Image Super-Resolution [7.986370916847687]
Diffusion models have emerged as a superior alternative to generative adversarial networks (GANs) for high-fidelity image generation. However, their real-time feasibility is hindered by slow training and inference speeds. This study proposes a wavelet-based conditional Diffusion GAN scheme for Single-Image Super-Resolution.
arXiv Detail & Related papers (2024-10-23T15:34:06Z)
ReNoise: Real Image Inversion Through Iterative Noising [62.96073631599749]
We introduce an inversion method with a high quality-to-operation ratio, enhancing reconstruction accuracy without increasing the number of operations. We evaluate the performance of our ReNoise technique using various sampling algorithms and models, including recent accelerated diffusion models.
arXiv Detail & Related papers (2024-03-21T17:52:08Z)
Mitigating Data Consistency Induced Discrepancy in Cascaded Diffusion Models for Sparse-view CT Reconstruction [4.227116189483428]
This study introduces a novel Cascaded Diffusion with Discrepancy Mitigation framework. It includes the low-quality image generation in latent space and the high-quality image generation in pixel space. It minimizes computational costs by moving some inference steps from pixel space to latent space.
arXiv Detail & Related papers (2024-03-14T12:58:28Z)
Efficient Diffusion Model for Image Restoration by Residual Shifting [63.02725947015132]
This study proposes a novel and efficient diffusion model for image restoration. Our method avoids the need for post-acceleration during inference, thereby avoiding the associated performance deterioration. Our method achieves superior or comparable performance to current state-of-the-art methods on three classical IR tasks.
arXiv Detail & Related papers (2024-03-12T05:06:07Z)
Decoupled Data Consistency with Diffusion Purification for Image Restoration [15.043002968696978]
We propose a novel diffusion-based image restoration solver that addresses issues by decoupling the reverse process from the data consistency steps. Our approach demonstrates versatility, making it highly adaptable for efficient problem-solving in latent space. The efficacy of our approach is validated through comprehensive experiments across various image restoration tasks, including image denoising, deblurring, inpainting, and super-resolution.
arXiv Detail & Related papers (2024-03-10T00:47:05Z)
AdaDiff: Adaptive Step Selection for Fast Diffusion Models [82.78899138400435]
We introduce AdaDiff, a lightweight framework designed to learn instance-specific step usage policies. AdaDiff is optimized using a policy method to maximize a carefully designed reward function. We conduct experiments on three image generation and two video generation benchmarks and demonstrate that our approach achieves similar visual quality compared to the baseline.
arXiv Detail & Related papers (2023-11-24T11:20:38Z)
CoDi: Conditional Diffusion Distillation for Higher-Fidelity and Faster Image Generation [49.3016007471979]
Large generative diffusion models have revolutionized text-to-image generation and offer immense potential for conditional generation tasks. However, their widespread adoption is hindered by the high computational cost, which limits their real-time application. We introduce a novel method dubbed CoDi, that adapts a pre-trained latent diffusion model to accept additional image conditioning inputs.
arXiv Detail & Related papers (2023-10-02T17:59:18Z)
ResShift: Efficient Diffusion Model for Image Super-resolution by Residual Shifting [70.83632337581034]
Diffusion-based image super-resolution (SR) methods are mainly limited by the low inference speed. We propose a novel and efficient diffusion model for SR that significantly reduces the number of diffusion steps. Our method constructs a Markov chain that transfers between the high-resolution image and the low-resolution image by shifting the residual.
arXiv Detail & Related papers (2023-07-23T15:10:02Z)
Simultaneous Image-to-Zero and Zero-to-Noise: Diffusion Models with Analytical Image Attenuation [53.04220377034574]
We propose incorporating an analytical image attenuation process into the forward diffusion process for high-quality (un)conditioned image generation. Our method represents the forward image-to-noise mapping as simultaneous textitimage-to-zero mapping and textitzero-to-noise mapping. We have conducted experiments on unconditioned image generation, textite.g., CIFAR-10 and CelebA-HQ-256, and image-conditioned downstream tasks such as super-resolution, saliency detection, edge detection, and image inpainting.
arXiv Detail & Related papers (2023-06-23T18:08:00Z)
BOOT: Data-free Distillation of Denoising Diffusion Models with Bootstrapping [64.54271680071373]
Diffusion models have demonstrated excellent potential for generating diverse images. Knowledge distillation has been recently proposed as a remedy that can reduce the number of inference steps to one or a few. We present a novel technique called BOOT, that overcomes limitations with an efficient data-free distillation algorithm.
arXiv Detail & Related papers (2023-06-08T20:30:55Z)
Low-Light Image Enhancement with Wavelet-based Diffusion Models [50.632343822790006]
Diffusion models have achieved promising results in image restoration tasks, yet suffer from time-consuming, excessive computational resource consumption, and unstable restoration. We propose a robust and efficient Diffusion-based Low-Light image enhancement approach, dubbed DiffLL.
arXiv Detail & Related papers (2023-06-01T03:08:28Z)
Nested Diffusion Processes for Anytime Image Generation [38.84966342097197]
We propose an anytime diffusion-based method that can generate viable images when stopped at arbitrary times before completion. In experiments on ImageNet and Stable Diffusion-based text-to-image generation, we show, both qualitatively and quantitatively, that our method's intermediate generation quality greatly exceeds that of the original diffusion model.
arXiv Detail & Related papers (2023-05-30T14:28:43Z)
Conffusion: Confidence Intervals for Diffusion Models [32.36217153362305]
Current diffusion-based methods do not provide statistical guarantees regarding the generated results. We propose Conffusion, wherein we fine-tune a pre-trained diffusion model to predict interval bounds in a single forward pass. We show that Conffusion outperforms the baseline method while being three orders of magnitude faster.
arXiv Detail & Related papers (2022-11-17T18:58:15Z)
Pyramidal Denoising Diffusion Probabilistic Models [43.9925721757248]
We present a novel pyramidal diffusion model to generate high resolution images using a single score function trained with a positional embedding. This enables a time-efficient sampling for image generation, and also solves the low batch size problem when training with limited resources.
arXiv Detail & Related papers (2022-08-03T06:26:18Z)

This list is automatically generated from the titles and abstracts of the papers in this site.