Related papers: Spectral Collapse in Diffusion Inversion

URL: http://arxiv.org/abs/2602.13303v1
Date: Mon, 09 Feb 2026 17:53:21 GMT
Title: Spectral Collapse in Diffusion Inversion
Authors: Nicolas Bourriez, Alexandre Verine, Auguste Genovesio
Abstract summary: Conditional diffusion inversion fails when the source domain is spectrally sparse compared to the target domain. We propose Orthogonal Variance Guidance (OVG), an inference-time method that corrects the ODE dynamics to enforce the theoretical Gaussian noise magnitude. OVG effectively restores photorealistic textures while preserving structural fidelity.
Score: 44.781674986581244
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Conditional diffusion inversion provides a powerful framework for unpaired image-to-image translation. However, we demonstrate through an extensive analysis that standard deterministic inversion (e.g. DDIM) fails when the source domain is spectrally sparse compared to the target domain (e.g., super-resolution, sketch-to-image). In these contexts, the recovered latent from the input does not follow the expected isotropic Gaussian distribution. Instead it exhibits a signal with lower frequencies, locking target sampling to oversmoothed and texture-poor generations. We term this phenomenon spectral collapse. We observe that stochastic alternatives attempting to restore the noise variance tend to break the semantic link to the input, leading to structural drift. To resolve this structure-texture trade-off, we propose Orthogonal Variance Guidance (OVG), an inference-time method that corrects the ODE dynamics to enforce the theoretical Gaussian noise magnitude within the null-space of the structural gradient. Extensive experiments on microscopy super-resolution (BBBC021) and sketch-to-image (Edges2Shoes) demonstrate that OVG effectively restores photorealistic textures while preserving structural fidelity.
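
The geometric idea in the abstract (restore the Gaussian noise magnitude while moving only in directions orthogonal to a structural gradient) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: OVG acts on the ODE dynamics during inversion, whereas the hypothetical `orthogonal_norm_correction` below applies a single closed-form correction to a flattened latent `z`, given one assumed structural gradient vector `g`. It relies only on the fact that a sample from an isotropic Gaussian in d dimensions has norm close to sqrt(d).

```python
import numpy as np

def orthogonal_norm_correction(z, g, eps=1e-12):
    """Illustrative only: bring ||z|| to sqrt(d) without moving along g.

    z : latent (any shape, flattened internally); assumed under-dispersed.
    g : structural gradient of the same size (hypothetical input).
    """
    z = np.asarray(z, dtype=np.float64).ravel()
    g = np.asarray(g, dtype=np.float64).ravel()
    d = z.size

    # Split z into the part along g (left untouched) and the orthogonal rest.
    g_unit = g / (np.linalg.norm(g) + eps)
    along = np.dot(z, g_unit) * g_unit
    ortho = z - along

    # Rescale only the orthogonal part so the total squared norm equals d.
    target_sq = max(d - np.dot(along, along), 0.0)
    scale = np.sqrt(target_sq) / (np.linalg.norm(ortho) + eps)
    return along + scale * ortho

# Usage: an under-dispersed latent (std 0.3 instead of 1) and a random gradient.
rng = np.random.default_rng(0)
d = 1024
z = 0.3 * rng.standard_normal(d)
g = rng.standard_normal(d)
z_new = orthogonal_norm_correction(z, g)
```

Because the update is a pure rescaling of the component orthogonal to `g`, the displacement `z_new - z` has (numerically) zero inner product with `g`, so a first-order approximation of the structural objective is unchanged while the norm is brought to sqrt(d).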

Related papers

The Malignant Tail: Spectral Segregation of Label Noise in Over-Parameterized Networks [0.0]
We experimentally isolate the Malignant Tail, a failure mode where networks functionally segregate signal and noise. We show that untrained networks actively segregate noise, allowing post-hoc Explicit Spectral Truncation to surgically prune the noise-dominated subspace. Our findings suggest that under label noise, excess spectral capacity is not harmless redundancy but a latent structural liability.
arXiv Detail & Related papers (2026-03-02T16:39:42Z)
Breaking the Bottlenecks: Scalable Diffusion Models for 3D Molecular Generation [0.0]
Diffusion models have emerged as a powerful class of generative models for molecular design. Their use remains constrained by long sampling trajectories, variance in the reverse process, and limited structural awareness in denoising dynamics. The Directly Denoising Diffusion Model mitigates these inefficiencies by replacing reverse MCMC updates with a deterministic denoising step.
arXiv Detail & Related papers (2026-01-13T20:09:44Z)
FreSca: Scaling in Frequency Space Enhances Diffusion Models [55.75504192166779]
This paper explores frequency-based control within latent diffusion models. We introduce FreSca, a novel framework that decomposes noise difference into low- and high-frequency components. FreSca operates without any model retraining or architectural change, offering model- and task-agnostic control.
arXiv Detail & Related papers (2025-04-02T22:03:11Z)
Rao-Blackwell Gradient Estimators for Equivariant Denoising Diffusion [55.95767828747407]
In domains such as molecular and protein generation, physical systems exhibit inherent symmetries that are critical to model. We present a framework that reduces training variance and provides a provably lower-variance gradient estimator. We also present a practical implementation of this estimator incorporating the loss and sampling procedure through a method we call Orbit Diffusion.
arXiv Detail & Related papers (2025-02-14T03:26:57Z)
Any-Resolution AI-Generated Image Detection by Spectral Learning [36.562914181733426]
We build upon the key idea that the spectral distribution of real images constitutes both an invariant and highly discriminative pattern for AI-generated image detection. Our approach achieves a 5.5% absolute improvement in AUC over the previous state-of-the-art across 13 recent generative approaches.
arXiv Detail & Related papers (2024-11-28T23:55:19Z)
There and Back Again: On the relation between Noise and Image Inversions in Diffusion Models [3.8384683391475556]
Diffusion Models generate new samples but lack a low-dimensional latent space that encodes the data into editable features. Inversion-based methods address this by reversing the denoising trajectory, transferring images to their approximated starting noise. We show that latents exhibit structural patterns in the form of less diverse noise predicted for smooth image areas.
arXiv Detail & Related papers (2024-10-31T00:30:35Z)
Spectrum Translation for Refinement of Image Generation (STIG) Based on Contrastive Learning and Spectral Filter Profile [15.5188527312094]
We propose a framework to mitigate the disparity in frequency domain of the generated images. This is realized by spectrum translation for the refinement of image generation (STIG) based on contrastive learning. We evaluate our framework across eight fake image datasets and various cutting-edge models to demonstrate the effectiveness of STIG.
arXiv Detail & Related papers (2024-03-08T06:39:24Z)
A Variational Perspective on Solving Inverse Problems with Diffusion Models [101.831766524264]
Inverse tasks can be formulated as inferring a posterior distribution over data. This is however challenging in diffusion models since the nonlinear and iterative nature of the diffusion process renders the posterior intractable. We propose a variational approach that by design seeks to approximate the true posterior distribution.
arXiv Detail & Related papers (2023-05-07T23:00:47Z)
Orthogonal Matrix Retrieval with Spatial Consensus for 3D Unknown-View Tomography [58.60249163402822]
Unknown-view tomography (UVT) reconstructs a 3D density map from its 2D projections at unknown, random orientations. The proposed OMR is more robust and performs significantly better than the previous state-of-the-art OMR approach.
arXiv Detail & Related papers (2022-07-06T21:40:59Z)
Gaussian MRF Covariance Modeling for Efficient Black-Box Adversarial Attacks [86.88061841975482]
We study the problem of generating adversarial examples in a black-box setting, where we only have access to a zeroth order oracle. We use this setting to find fast one-step adversarial attacks, akin to a black-box version of the Fast Gradient Sign Method (FGSM). We show that the method uses fewer queries and achieves higher attack success rates than the current state of the art.
arXiv Detail & Related papers (2020-10-08T18:36:51Z)
Hyperspectral Image Denoising with Partially Orthogonal Matrix Vector Tensor Factorization [42.56231647066719]
Hyperspectral image (HSI) has some advantages over natural image for various applications due to the extra spectral information. During the acquisition, it is often contaminated by severe noises including Gaussian noise, impulse noise, deadlines, and stripes. We present an HSI restoration method named smooth and robust low rank tensor recovery.
arXiv Detail & Related papers (2020-06-29T02:10:07Z)
Residual-Sparse Fuzzy $C$-Means Clustering Incorporating Morphological Reconstruction and Wavelet frames [146.63177174491082]
Fuzzy $C$-Means (FCM) algorithm incorporates a morphological reconstruction operation and a tight wavelet frame transform. We present an improved FCM algorithm by imposing an $\ell_0$ regularization term on the residual between the feature set and its ideal value. Experimental results reported for synthetic, medical, and color images show that the proposed algorithm is effective and efficient, and outperforms other algorithms.
arXiv Detail & Related papers (2020-02-14T10:00:03Z)

This list is automatically generated from the titles and abstracts of the papers on this site.