Related papers: HiWave: Training-Free High-Resolution Image Generation via Wavelet-Based Diffusion Sampling

URL: http://arxiv.org/abs/2506.20452v1
Date: Wed, 25 Jun 2025 13:58:37 GMT
Title: HiWave: Training-Free High-Resolution Image Generation via Wavelet-Based Diffusion Sampling
Authors: Tobias Vontobel, Seyedmorteza Sadat, Farnood Salehi, Romann M. Weber
Abstract summary: HiWave is a training-free, zero-shot approach that substantially enhances visual fidelity and structural coherence in ultra-high-resolution image synthesis. A user study confirmed HiWave's performance, where it was preferred over the state-of-the-art alternative in more than 80% of comparisons.
Score: 1.9474278832087901
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Diffusion models have emerged as the leading approach for image synthesis, demonstrating exceptional photorealism and diversity. However, training diffusion models at high resolutions remains computationally prohibitive, and existing zero-shot generation techniques for synthesizing images beyond training resolutions often produce artifacts, including object duplication and spatial incoherence. In this paper, we introduce HiWave, a training-free, zero-shot approach that substantially enhances visual fidelity and structural coherence in ultra-high-resolution image synthesis using pretrained diffusion models. Our method employs a two-stage pipeline: generating a base image from the pretrained model followed by a patch-wise DDIM inversion step and a novel wavelet-based detail enhancer module. Specifically, we first utilize inversion methods to derive initial noise vectors that preserve global coherence from the base image. Subsequently, during sampling, our wavelet-domain detail enhancer retains low-frequency components from the base image to ensure structural consistency, while selectively guiding high-frequency components to enrich fine details and textures. Extensive evaluations using Stable Diffusion XL demonstrate that HiWave effectively mitigates common visual artifacts seen in prior methods, achieving superior perceptual quality. A user study confirmed HiWave's performance, where it was preferred over the state-of-the-art alternative in more than 80% of comparisons, highlighting its effectiveness for high-quality, ultra-high-resolution image synthesis without requiring retraining or architectural modifications.
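The frequency split described in the abstract (keep low-frequency structure from the base image, take high-frequency detail from elsewhere) can be illustrated with a single-level 2D Haar wavelet transform. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the choice of the Haar wavelet, and the `high_gain` parameter are hypothetical, and the abstract does not specify how the high-frequency bands are guided during sampling.

```python
import numpy as np


def haar_dwt2(x):
    """Single-level orthonormal 2D Haar transform of an array with even height and width.

    Returns the low-frequency band LL and a tuple of high-frequency bands (LH, HL, HH).
    """
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    ll = (a + b + c + d) / 2.0
    lh = (a - b + c - d) / 2.0
    hl = (a + b - c - d) / 2.0
    hh = (a - b - c + d) / 2.0
    return ll, (lh, hl, hh)


def haar_idwt2(ll, highs):
    """Inverse of haar_dwt2: rebuild the full-resolution array from its four bands."""
    lh, hl, hh = highs
    out = np.empty((ll.shape[0] * 2, ll.shape[1] * 2), dtype=np.result_type(ll, lh))
    out[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    out[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    out[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    out[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return out


def merge_bands(base, detail, high_gain=1.0):
    """Hypothetical helper: low frequencies from `base`, high frequencies from `detail`.

    `high_gain` scales the borrowed high-frequency bands; it is an illustrative knob,
    not a parameter taken from the paper.
    """
    base_ll, _ = haar_dwt2(base)
    _, detail_highs = haar_dwt2(detail)
    scaled = tuple(high_gain * h for h in detail_highs)
    return haar_idwt2(base_ll, scaled)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    base = rng.random((8, 8))    # stands in for the structurally coherent base image
    detail = rng.random((8, 8))  # stands in for a prediction carrying fine detail
    merged = merge_bands(base, detail)
    # The merged result shares its low-frequency band with the base image.
    print(np.allclose(haar_dwt2(merged)[0], haar_dwt2(base)[0]))
```

Because the transform is orthonormal and invertible, the low-frequency band of the merged array matches that of the base exactly, which is the sense in which structure is retained in this sketch.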

Related papers

Quaternion Wavelet-Conditioned Diffusion Models for Image Super-Resolution [7.986370916847687]
We introduce ResQu, a novel SR framework that integrates a quaternion wavelet preprocessing framework with latent diffusion models. Our approach enhances the conditioning process by exploiting quaternion wavelet embeddings, which are dynamically integrated at different stages of denoising. Our method achieves outstanding SR results, in many cases outperforming existing approaches in perceptual quality and standard evaluation metrics.
arXiv Detail & Related papers (2025-05-01T06:17:33Z)
A Hybrid Wavelet-Fourier Method for Next-Generation Conditional Diffusion Models [0.0]
We present a novel generative modeling framework, Wavelet-Fourier-Diffusion, which adapts the diffusion paradigm to hybrid frequency representations. We show how the hybrid frequency-based representation improves control over global coherence and fine texture synthesis.
arXiv Detail & Related papers (2025-04-04T17:11:04Z)
MSF: Efficient Diffusion Model Via Multi-Scale Latent Factorize [18.73205699076486]
We introduce a diffusion framework leveraging multi-scale latent factorization. Our framework decomposes the denoising target, typically latent features from a pretrained Variational Autoencoder, into a low-frequency base signal. Our proposed architecture facilitates reduced sampling steps during the residual learning stage.
arXiv Detail & Related papers (2025-01-23T03:18:23Z)
Arbitrary-steps Image Super-resolution via Diffusion Inversion [68.78628844966019]
This study presents a new image super-resolution (SR) technique based on diffusion inversion, aiming at harnessing the rich image priors encapsulated in large pre-trained diffusion models to improve SR performance. We design a Partial noise Prediction strategy to construct an intermediate state of the diffusion model, which serves as the starting sampling point. Once trained, this noise predictor can be used to initialize the sampling process partially along the diffusion trajectory, generating the desirable high-resolution result.
arXiv Detail & Related papers (2024-12-12T07:24:13Z)
Oscillation Inversion: Understand the structure of Large Flow Model through the Lens of Inversion Method [60.88467353578118]
We show that a fixed-point-inspired iterative approach to invert real-world images does not achieve convergence, instead oscillating between distinct clusters. We introduce a simple and fast distribution transfer technique that facilitates image enhancement, stroke-based recoloring, as well as visual prompt-guided image editing.
arXiv Detail & Related papers (2024-11-17T17:45:37Z)
Effective Diffusion Transformer Architecture for Image Super-Resolution [63.254644431016345]
We design an effective diffusion transformer for image super-resolution (DiT-SR). In practice, DiT-SR leverages an overall U-shaped architecture, and adopts a uniform isotropic design for all the transformer blocks. We analyze the limitation of the widely used AdaLN, and present a frequency-adaptive time-step conditioning module.
arXiv Detail & Related papers (2024-09-29T07:14:16Z)
Timestep-Aware Diffusion Model for Extreme Image Rescaling [47.89362819768323]
We propose a novel framework called Timestep-Aware Diffusion Model (TADM) for extreme image rescaling. TADM performs rescaling operations in the latent space of a pre-trained autoencoder. It effectively leverages powerful natural image priors learned by a pre-trained text-to-image diffusion model.
arXiv Detail & Related papers (2024-08-17T09:51:42Z)
ScaleCrafter: Tuning-free Higher-Resolution Visual Generation with Diffusion Models [126.35334860896373]
We investigate the capability of generating images from pre-trained diffusion models at much higher resolutions than the training image sizes. Existing works for higher-resolution generation, such as attention-based and joint-diffusion approaches, cannot adequately address these issues. We propose a simple yet effective re-dilation that can dynamically adjust the convolutional perception field during inference.
arXiv Detail & Related papers (2023-10-11T17:52:39Z)
Stage-by-stage Wavelet Optimization Refinement Diffusion Model for Sparse-View CT Reconstruction [14.037398189132468]
We present an innovative approach named the Stage-by-stage Wavelet Optimization Refinement Diffusion (SWORD) model for sparse-view CT reconstruction. Specifically, we establish a unified mathematical model integrating low-frequency and high-frequency generative models, achieving the solution with an optimization procedure. Our method is rooted in established optimization theory and comprises three distinct stages: low-frequency generation, high-frequency refinement, and domain transform.
arXiv Detail & Related papers (2023-08-30T10:48:53Z)
ACDMSR: Accelerated Conditional Diffusion Models for Single Image Super-Resolution [84.73658185158222]
We propose a diffusion model-based super-resolution method called ACDMSR. Our method adapts the standard diffusion model to perform super-resolution through a deterministic iterative denoising process. Our approach generates more visually realistic counterparts for low-resolution images, emphasizing its effectiveness in practical scenarios.
arXiv Detail & Related papers (2023-07-03T06:49:04Z)
Low-Light Image Enhancement with Wavelet-based Diffusion Models [50.632343822790006]
Diffusion models have achieved promising results in image restoration tasks, yet suffer from time-consuming inference, excessive computational resource consumption, and unstable restoration. We propose a robust and efficient Diffusion-based Low-Light image enhancement approach, dubbed DiffLL.
arXiv Detail & Related papers (2023-06-01T03:08:28Z)
WaveDM: Wavelet-Based Diffusion Models for Image Restoration [43.254438752311714]
Wavelet-Based Diffusion Model (WaveDM) learns the distribution of clean images in the wavelet domain conditioned on the wavelet spectrum of degraded images after wavelet transform. WaveDM achieves state-of-the-art performance with efficiency comparable to traditional one-pass methods.
arXiv Detail & Related papers (2023-05-23T08:41:04Z)

This list is automatically generated from the titles and abstracts of the papers on this site.