Gradient Variance Reveals Failure Modes in Flow-Based Generative Models
- URL: http://arxiv.org/abs/2510.18118v1
- Date: Mon, 20 Oct 2025 21:37:11 GMT
- Title: Gradient Variance Reveals Failure Modes in Flow-Based Generative Models
- Authors: Teodora Reu, Sixtine Dromigny, Michael Bronstein, Francisco Vargas
- Abstract summary: Rectified Flows learn ODE vector fields whose trajectories are straight between source and target distributions, enabling near one-step inference. We prove that a memorizing vector field exists even when training interpolants intersect, and that optimizing the straight-path objective converges to this ill-defined field.
- Score: 2.4223685315022867
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Rectified Flows learn ODE vector fields whose trajectories are straight between source and target distributions, enabling near one-step inference. We show that this straight-path objective conceals fundamental failure modes: under deterministic training, low gradient variance drives memorization of arbitrary training pairings, even when interpolant lines between pairs intersect. To analyze this mechanism, we study Gaussian-to-Gaussian transport and use the loss gradient variance across stochastic and deterministic regimes to characterize which vector fields optimization favors in each setting. We then show that, in a setting where all interpolating lines intersect, applying Rectified Flow yields the same specific pairings at inference as during training. More generally, we prove that a memorizing vector field exists even when training interpolants intersect, and that optimizing the straight-path objective converges to this ill-defined field. At inference, deterministic integration reproduces the exact training pairings. We validate our findings empirically on the CelebA dataset, confirming that deterministic interpolants induce memorization, while the injection of small noise restores generalization.
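The mechanism described in the abstract can be sketched in a few lines of numpy. This is a minimal illustration, not the paper's implementation: the toy constant "memorizing" field and the noise parameter `sigma` are assumptions chosen to mirror the abstract's claims that a deterministic straight-path objective is minimized by a pairing-memorizing field, and that deterministic integration then reproduces the training pairing.

```python
import numpy as np

rng = np.random.default_rng(0)

def straight_path_loss(model, x0, x1, sigma=0.0):
    """One-sample Monte Carlo estimate of the straight-path objective.

    The linear interpolant x_t = (1 - t) * x0 + t * x1 has constant
    velocity x1 - x0, which the model regresses. sigma = 0 is the
    deterministic regime; sigma > 0 injects the small interpolant noise
    that the abstract reports restores generalization.
    """
    t = rng.uniform()
    x_t = (1.0 - t) * x0 + t * x1
    if sigma > 0.0:
        x_t = x_t + sigma * rng.standard_normal(x_t.shape)
    v_target = x1 - x0
    return float(np.mean((model(x_t, t) - v_target) ** 2))

def one_step_sample(model, x0):
    """Single Euler step from t=0 to t=1; exact when the field is straight."""
    return x0 + model(x0, 0.0)

# A toy training pairing and the constant "memorizing" field for it:
x0 = rng.standard_normal(3)
x1 = rng.standard_normal(3)
memorizing_field = lambda x, t: x1 - x0

# Deterministic training loss is exactly zero for this field, and
# deterministic one-step inference reproduces the training pairing x0 -> x1.
assert straight_path_loss(memorizing_field, x0, x1, sigma=0.0) == 0.0
assert np.allclose(one_step_sample(memorizing_field, x0), x1)
```

In a real model the field is a neural network shared across all pairings; the point of the sketch is only that the deterministic objective is perfectly satisfied by a field that hard-codes the training pairing, whereas noisy interpolants (`sigma > 0`) make that field incur nonzero loss.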
Related papers
- Flow Matching is Adaptive to Manifold Structures [32.55405572762157]
Flow matching is a simulation-free alternative to diffusion-based generative modeling. We show how flow matching adapts to data geometry and circumvents the curse of dimensionality.
arXiv Detail & Related papers (2026-02-25T23:52:32Z)
- Binary Flow Matching: Prediction-Loss Space Alignment for Robust Learning [23.616336786063552]
Flow matching has emerged as a powerful framework for generative modeling. We identify a latent structural mismatch that arises when it is coupled with velocity-based objectives. We prove that re-aligning the objective to the signal space eliminates the singular weighting.
arXiv Detail & Related papers (2026-02-11T02:02:30Z)
- Multi-Marginal Flow Matching with Adversarially Learnt Interpolants [27.294164408278448]
This paper proposes a novel flow matching method that overcomes the limitations of existing multi-marginal trajectory inference algorithms. Our proposed method, ALI-CFM, uses a GAN-inspired adversarial loss to fit neurally parametrised interpolant curves between source and target points. We showcase the versatility and scalability of our method by outperforming the existing baselines on spatial transcriptomics and cell tracking datasets.
arXiv Detail & Related papers (2025-10-01T17:47:27Z)
- Variational Rectified Flow Matching [100.63726791602049]
Variational Rectified Flow Matching enhances classic rectified flow matching by modeling multi-modal velocity vector fields. We show on synthetic data that variational rectified flow matching leads to compelling results.
arXiv Detail & Related papers (2025-02-13T18:59:15Z)
- Flow Matching: Markov Kernels, Stochastic Processes and Transport Plans [1.9766522384767222]
Flow matching techniques can be used to solve inverse problems. We show how flow matching applies in this setting and briefly address continuous normalizing flows and score matching techniques.
arXiv Detail & Related papers (2025-01-28T10:28:17Z)
- On the Wasserstein Convergence and Straightness of Rectified Flow [54.580605276017096]
Rectified Flow (RF) is a generative model that aims to learn straight flow trajectories from noise to data. We provide a theoretical analysis of the Wasserstein distance between the sampling distribution of RF and the target distribution. We present general conditions guaranteeing uniqueness and straightness of 1-RF, which is in line with previous empirical findings.
arXiv Detail & Related papers (2024-10-19T02:36:11Z)
- Efficient Trajectory Inference in Wasserstein Space Using Consecutive Averaging [3.8623569699070353]
Trajectory inference deals with reconstructing continuous processes from such observations. We propose methods for B-spline approximation and interpolation of point clouds through consecutive averaging that is intrinsic to the Wasserstein space. We prove linear convergence rates and rigorously evaluate our method on cell data characterized by bifurcations, merges, and trajectory splitting scenarios.
arXiv Detail & Related papers (2024-05-30T04:19:20Z)
- The High Line: Exact Risk and Learning Rate Curves of Stochastic Adaptive Learning Rate Algorithms [8.681909776958184]
We develop a framework for analyzing the training and learning rate dynamics on a large class of high-dimensional optimization problems.
We give exact expressions for the risk and learning rate curves in terms of a deterministic solution to a system of ODEs.
We investigate in detail two adaptive learning rates -- an idealized exact line search and AdaGrad-Norm on the least squares problem.
arXiv Detail & Related papers (2024-05-30T00:27:52Z)
- Time-series Generation by Contrastive Imitation [87.51882102248395]
We study a generative framework that seeks to combine the strengths of both: Motivated by a moment-matching objective to mitigate compounding error, we optimize a local (but forward-looking) transition policy.
At inference, the learned policy serves as the generator for iterative sampling, and the learned energy serves as a trajectory-level measure for evaluating sample quality.
arXiv Detail & Related papers (2023-11-02T16:45:25Z)
- Semi-DETR: Semi-Supervised Object Detection with Detection Transformers [105.45018934087076]
We analyze the DETR-based framework on semi-supervised object detection (SSOD).
We present Semi-DETR, the first transformer-based end-to-end semi-supervised object detector.
Our method outperforms all state-of-the-art methods by clear margins.
arXiv Detail & Related papers (2023-07-16T16:32:14Z)
- Adaptive Annealed Importance Sampling with Constant Rate Progress [68.8204255655161]
Annealed Importance Sampling (AIS) synthesizes weighted samples from an intractable distribution.
We propose the Constant Rate AIS algorithm and its efficient implementation for $\alpha$-divergences.
arXiv Detail & Related papers (2023-06-27T08:15:28Z)
- Intersection of Parallels as an Early Stopping Criterion [64.8387564654474]
We propose a method to spot an early stopping point in the training iterations without the need for a validation set.
For a wide range of learning rates, our method, called Cosine-Distance Criterion (CDC), leads to better generalization on average than all the methods that we compare against.
arXiv Detail & Related papers (2022-08-19T19:42:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.