From Circuits to Dynamics: Understanding and Stabilizing Failure in 3D Diffusion Transformers
- URL: http://arxiv.org/abs/2602.11130v1
- Date: Wed, 11 Feb 2026 18:42:05 GMT
- Title: From Circuits to Dynamics: Understanding and Stabilizing Failure in 3D Diffusion Transformers
- Authors: Maximilian Plattner, Fabian Paischer, Johannes Brandstetter, Arturs Berzins,
- Abstract summary: 3D diffusion transformers exhibit a catastrophic mode of failure. We call this phenomenon Meltdown. We introduce PowerRemap, a test-time control that stabilizes sparse point-cloud conditioning.
- Score: 25.11520870904882
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reliable surface completion from sparse point clouds underpins many applications spanning content creation and robotics. While 3D diffusion transformers attain state-of-the-art results on this task, we uncover that they exhibit a catastrophic mode of failure: arbitrarily small on-surface perturbations to the input point cloud can fracture the output into multiple disconnected pieces -- a phenomenon we call Meltdown. Using activation-patching from mechanistic interpretability, we localize Meltdown to a single early denoising cross-attention activation. We find that the singular-value spectrum of this activation provides a scalar proxy: its spectral entropy rises when fragmentation occurs and returns to baseline when patched. Interpreted through diffusion dynamics, we show that this proxy tracks a symmetry-breaking bifurcation of the reverse process. Guided by this insight, we introduce PowerRemap, a test-time control that stabilizes sparse point-cloud conditioning. We demonstrate that Meltdown persists across state-of-the-art architectures (WaLa, Make-a-Shape), datasets (GSO, SimJEB) and denoising strategies (DDPM, DDIM), and that PowerRemap effectively counters this failure with stabilization rates of up to 98.3%. Overall, this work is a case study on how diffusion model behavior can be understood and guided based on mechanistic analysis, linking a circuit-level cross-attention mechanism to diffusion-dynamics accounts of trajectory bifurcations.
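The spectral-entropy proxy mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `spectral_entropy`, the choice to normalize the singular values so they sum to one, and the random test matrices are all assumptions made here for illustration. The abstract does not specify the exact normalization or the shape of the cross-attention activation.

```python
import numpy as np


def spectral_entropy(activation: np.ndarray, eps: float = 1e-12) -> float:
    """Shannon entropy of the normalized singular-value spectrum.

    Assumption: singular values are normalized to sum to 1 and treated as
    a probability distribution; other normalizations (e.g. squared
    singular values) are equally plausible.
    """
    # Singular values only; the singular vectors are not needed.
    s = np.linalg.svd(activation, compute_uv=False)
    p = s / (s.sum() + eps)
    # Drop numerically zero entries so log() stays finite.
    p = p[p > eps]
    return float(-(p * np.log(p)).sum())


# Hypothetical stand-ins for an activation matrix (tokens x channels).
rng = np.random.default_rng(0)
low_rank = np.outer(rng.normal(size=64), rng.normal(size=32))  # rank 1
full_rank = rng.normal(size=(64, 32))                          # rank 32

print("rank-1 entropy:   ", spectral_entropy(low_rank))
print("full-rank entropy:", spectral_entropy(full_rank))
```

Under this definition, an activation dominated by one direction has entropy near zero, while one whose energy is spread across many directions approaches the upper bound of log(min(rows, cols)). A rise in this scalar is the kind of signal the abstract associates with fragmentation.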
Related papers
- MirrorLA: Reflecting Feature Map for Vision Linear Attention [49.41670925034762]
Linear attention significantly reduces the computational complexity of Transformers from quadratic to linear, yet it consistently lags behind softmax-based attention in performance. We propose MirrorLA, a geometric framework that substitutes passive truncation with active reorientation. MirrorLA achieves state-of-the-art performance across standard benchmarks, demonstrating that strictly linear efficiency can be achieved without compromising representational fidelity.
arXiv Detail & Related papers (2026-02-04T09:14:09Z) - Causal Autoregressive Diffusion Language Model [70.7353007255797]
CARD reformulates the diffusion process within a strictly causal attention mask, enabling dense, per-token supervision in a single forward pass. Our results demonstrate that CARD achieves ARM-level data efficiency while unlocking the latency benefits of parallel generation.
arXiv Detail & Related papers (2026-01-29T17:38:29Z) - Breaking the Bottlenecks: Scalable Diffusion Models for 3D Molecular Generation [0.0]
Diffusion models have emerged as a powerful class of generative models for molecular design. Their use remains constrained by long sampling trajectories, variance in the reverse process, and limited structural awareness in denoising dynamics. The Directly Denoising Diffusion Model mitigates these inefficiencies by replacing reverse MCMC updates with a deterministic denoising step.
arXiv Detail & Related papers (2026-01-13T20:09:44Z) - Analyzing the Mechanism of Attention Collapse in VGGT from a Dynamics Perspective [13.434698786044107]
Visual Geometry Grounded Transformer (VGGT) delivers state-of-the-art feed-forward 3D reconstruction. Its global self-attention layer suffers from a drastic collapse phenomenon when the input sequence exceeds a few hundred frames. We establish a rigorous mathematical explanation of the collapse by viewing the global attention as a degenerate diffusion process.
arXiv Detail & Related papers (2025-12-25T14:34:27Z) - Exploring Magnitude Preservation and Rotation Modulation in Diffusion Transformers [5.187307904567701]
We propose a magnitude-preserving design that stabilizes training without normalization layers. Motivated by the goal of maintaining activation magnitudes, we additionally introduce rotation modulation. We show that magnitude-preserving strategies significantly improve performance, notably reducing FID scores by $\sim$12.8%.
arXiv Detail & Related papers (2025-05-25T12:25:50Z) - Towards Unified and Lossless Latent Space for 3D Molecular Latent Diffusion Modeling [90.23688195918432]
3D molecule generation is crucial for drug discovery and material science. Existing approaches typically maintain separate latent spaces for invariant and equivariant modalities. We propose UAE-3D, a multi-modal VAE that compresses 3D molecules into latent sequences from a unified latent space.
arXiv Detail & Related papers (2025-03-19T08:56:13Z) - Towards Stabilized and Efficient Diffusion Transformers through Long-Skip-Connections with Spectral Constraints [51.83081671798784]
Diffusion Transformers (DiT) have emerged as a powerful architecture for image and video generation, offering superior quality and scalability. DiT's practical application suffers from inherent dynamic feature instability, leading to error amplification during cached inference. We propose Skip-DiT, an image and video generative DiT variant enhanced with Long-Skip-Connections (LSCs), the key efficiency component in U-Nets.
arXiv Detail & Related papers (2024-11-26T17:28:10Z) - Predicting Cascading Failures with a Hyperparametric Diffusion Model [66.89499978864741]
We study cascading failures in power grids through the lens of diffusion models.
Our model integrates viral diffusion principles with physics-based concepts.
We show that this diffusion model can be learned from traces of cascading failures.
arXiv Detail & Related papers (2024-06-12T02:34:24Z) - Dynamic Addition of Noise in a Diffusion Model for Anomaly Detection [2.209921757303168]
Diffusion models have found valuable applications in anomaly detection by capturing the nominal data distribution and identifying anomalies via reconstruction.
Despite their merits, they struggle to localize anomalies of varying scales, especially larger anomalies such as entire missing components.
We present a novel framework that enhances the capability of diffusion models by extending the previously introduced implicit conditioning approach (Meng et al., 2022) in three significant ways.
arXiv Detail & Related papers (2024-01-09T09:57:38Z) - Reminiscence of classical chaos in driven transmons [117.851325578242]
We show that even off-resonant drives can cause strong modifications to the structure of the transmon spectrum rendering a large part of it chaotic.
Results lead to a photon number threshold characterizing the appearance of chaos-induced quantum demolition effects.
arXiv Detail & Related papers (2022-07-19T16:04:46Z) - Non-trivial effect of dephasing: Enhancement of rectification of spin current in graded XX chains [0.0]
We consider dephasing noise modelled by current preserving Lindblad dissipators acting on graded versions of spin systems.
We find that the interplay between dephasing and graded systems gives rise to non-trivial behavior.
arXiv Detail & Related papers (2022-07-06T13:58:34Z)