Improving Classifier-Free Guidance of Flow Matching via Manifold Projection
- URL: http://arxiv.org/abs/2601.21892v1
- Date: Thu, 29 Jan 2026 15:49:31 GMT
- Title: Improving Classifier-Free Guidance of Flow Matching via Manifold Projection
- Authors: Jian-Feng Cai, Haixia Liu, Zhengyi Su, Chao Wang
- Abstract summary: We provide a principled interpretation of CFG through the lens of optimization. We reformulate CFG sampling as a homotopy optimization with a manifold constraint. Our proposed methods are training-free and consistently improve generation fidelity, prompt alignment, and robustness to the guidance scale.
- Score: 3.6087998976768128
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Classifier-free guidance (CFG) is a widely used technique for controllable generation in diffusion and flow-based models. Despite its empirical success, CFG relies on a heuristic linear extrapolation that is often sensitive to the guidance scale. In this work, we provide a principled interpretation of CFG through the lens of optimization. We demonstrate that the velocity field in flow matching corresponds to the gradient of a sequence of smoothed distance functions, which guides latent variables toward the scaled target image set. This perspective reveals that the standard CFG formulation is an approximation of this gradient, where the prediction gap, the discrepancy between conditional and unconditional outputs, governs guidance sensitivity. Leveraging this insight, we reformulate CFG sampling as a homotopy optimization with a manifold constraint. This formulation necessitates a manifold projection step, which we implement via an incremental gradient descent scheme during sampling. To improve computational efficiency and stability, we further enhance this iterative process with Anderson Acceleration without requiring additional model evaluations. Our proposed methods are training-free and consistently improve generation fidelity, prompt alignment, and robustness to the guidance scale. We validate their effectiveness across diverse benchmarks, demonstrating significant improvements on large-scale models such as DiT-XL-2-256, Flux, and Stable Diffusion 3.5.
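The linear extrapolation the abstract refers to can be sketched as follows. This is a minimal illustration of standard CFG inside a generic Euler ODE sampler; the names `v_cond`, `v_uncond`, and `guidance_scale` are illustrative, and the paper's manifold-projection and Anderson-accelerated refinements are not reproduced here.

```python
def cfg_velocity(v_cond, v_uncond, guidance_scale):
    """Linear-extrapolation CFG: v = v_u + w * (v_c - v_u).

    The term (v_c - v_u) is the 'prediction gap' the abstract says
    governs guidance sensitivity.
    """
    return [vu + guidance_scale * (vc - vu) for vc, vu in zip(v_cond, v_uncond)]

def euler_cfg_step(x, v_cond, v_uncond, guidance_scale, dt):
    """One Euler step of the guided flow (a generic ODE-sampler step)."""
    v = cfg_velocity(v_cond, v_uncond, guidance_scale)
    return [xi + dt * vi for xi, vi in zip(x, v)]

# Toy usage: with guidance_scale = 1, CFG reduces to the conditional field.
x = [0.0, 0.0]
v_c, v_u = [1.0, 2.0], [0.5, 1.0]
print(euler_cfg_step(x, v_c, v_u, guidance_scale=1.0, dt=0.1))  # [0.1, 0.2]
```

With `guidance_scale > 1` the update overshoots the conditional prediction along the prediction gap, which is why large scales amplify any error in that gap.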
Related papers
- CFG-Ctrl: Control-Based Classifier-Free Diffusion Guidance [31.552164852288325]
We introduce Sliding Mode Control CFG (SMC-CFG), which enforces the generative flow toward a rapidly convergent sliding manifold. SMC-CFG outperforms standard CFG in semantic alignment and robustness across a wide range of guidance scales.
arXiv Detail & Related papers (2026-03-03T18:59:48Z) - LFPO: Likelihood-Free Policy Optimization for Masked Diffusion Models [48.68246945083386]
Likelihood-Free Policy Optimization (LFPO) is a native framework that maps the concept of vector-field flow matching to the discrete token space. LFPO formulates alignment as geometric velocity rectification, which directly optimizes denoising logits via contrastive updates. Experiments demonstrate that LFPO not only outperforms state-of-the-art baselines on code and reasoning benchmarks but also accelerates inference by approximately 20% through reduced diffusion steps.
arXiv Detail & Related papers (2026-03-02T07:42:55Z) - Enhancing Diffusion Model Guidance through Calibration and Regularization [9.22066257345387]
This paper introduces two complementary contributions to address this issue. First, we propose a differentiable calibration objective based on the smoothed Expected Calibration Error (Smooth ECE). Second, we develop enhanced sampling guidance methods that operate on off-the-shelf classifiers without requiring retraining.
arXiv Detail & Related papers (2025-11-08T04:23:42Z) - Rectified-CFG++ for Flow Based Models [26.896426878221718]
We present Rectified-CFG++, an adaptive predictor-corrector guidance scheme that couples the deterministic efficiency of rectified flows with a geometry-aware conditioning rule. Experiments on large-scale text-to-image models (Flux, Stable Diffusion 3/3.5, Lumina) show that Rectified-CFG++ consistently outperforms standard CFG on benchmark datasets.
arXiv Detail & Related papers (2025-10-09T00:00:47Z) - DiffusionNFT: Online Diffusion Reinforcement with Forward Process [99.94852379720153]
Diffusion Negative-aware FineTuning (DiffusionNFT) is a new online RL paradigm that optimizes diffusion models directly on the forward process via flow matching. DiffusionNFT is up to $25\times$ more efficient than FlowGRPO in head-to-head comparisons, while being CFG-free.
arXiv Detail & Related papers (2025-09-19T16:09:33Z) - Solving Inverse Problems with FLAIR [68.87167940623318]
We present FLAIR, a training-free variational framework that leverages flow-based generative models as priors for inverse problems. Results on standard imaging benchmarks demonstrate that FLAIR consistently outperforms existing diffusion- and flow-based methods in terms of reconstruction quality and sample diversity.
arXiv Detail & Related papers (2025-06-03T09:29:47Z) - Contrastive CFG: Improving CFG in Diffusion Models by Contrasting Positive and Negative Concepts [55.298031232672734]
Classifier-Free Guidance (CFG) has proven effective in conditional diffusion model sampling for improved condition alignment.
We present a novel method to enhance negative CFG guidance using contrastive loss.
arXiv Detail & Related papers (2024-11-26T03:29:27Z) - Eliminating Oversaturation and Artifacts of High Guidance Scales in Diffusion Models [27.640009920058187]
We revisit the CFG update rule and introduce modifications to address this issue. We propose down-weighting the parallel component to achieve high-quality generations without oversaturation. We also introduce a new rescaling momentum method for the CFG update rule based on this insight.
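The parallel-component down-weighting summarized above can be sketched as follows: split the prediction gap into a component parallel to the conditional prediction and an orthogonal remainder, then scale the parallel part. This is a minimal illustration with plain Python vectors; the down-weight factor `eta` and the variable names are assumptions, not taken verbatim from the paper.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def projected_guidance(v_cond, v_uncond, guidance_scale, eta=0.0):
    """CFG update with the parallel part of the gap scaled by eta.

    eta = 1 recovers standard CFG; eta = 0 keeps only the component of
    the gap orthogonal to the conditional prediction.
    """
    gap = [c - u for c, u in zip(v_cond, v_uncond)]
    # Project the gap onto the conditional prediction direction.
    coef = dot(gap, v_cond) / (dot(v_cond, v_cond) + 1e-12)
    parallel = [coef * c for c in v_cond]
    orthogonal = [g - p for g, p in zip(gap, parallel)]
    guided_gap = [eta * p + o for p, o in zip(parallel, orthogonal)]
    # Standard CFG rewritten as v_c + (w - 1) * gap, with the gap reshaped.
    return [c + (guidance_scale - 1.0) * g for c, g in zip(v_cond, guided_gap)]
```

The intuition is that the parallel component mostly rescales the prediction's magnitude (driving oversaturation at high scales), while the orthogonal component carries the directional, condition-aligning signal.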
arXiv Detail & Related papers (2024-10-03T12:06:29Z) - Adaptive Federated Learning Over the Air [108.62635460744109]
We propose a federated version of adaptive gradient methods, particularly AdaGrad and Adam, within the framework of over-the-air model training.
Our analysis shows that the AdaGrad-based training algorithm converges to a stationary point at the rate of $\mathcal{O}\left(\ln(T) / T^{1 - \frac{1}{\alpha}}\right)$.
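The per-coordinate AdaGrad update underlying the summary above can be sketched as follows. This is a minimal, non-federated illustration: the over-the-air aggregation of client gradients analyzed in the paper is abstracted into a single, already-averaged gradient.

```python
import math

def adagrad_step(params, grads, accum, lr=0.1, eps=1e-8):
    """One AdaGrad step: accumulate squared gradients, then scale the
    learning rate per coordinate by the root of the accumulator."""
    new_accum = [a + g * g for a, g in zip(accum, grads)]
    new_params = [p - lr * g / (math.sqrt(a) + eps)
                  for p, g, a in zip(params, grads, new_accum)]
    return new_params, new_accum

# Toy usage: coordinates with large past gradients take smaller steps.
params, accum = [1.0, 1.0], [0.0, 100.0]
params, accum = adagrad_step(params, [2.0, 2.0], accum)
```

Because the accumulator only grows, the effective step size decays over time, which is what produces convergence rates of the $\ln(T)/T^{1-1/\alpha}$ form quoted above.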
arXiv Detail & Related papers (2024-03-11T09:10:37Z) - Adaptive Guidance: Training-free Acceleration of Conditional Diffusion
Models [44.58960475893552]
"Adaptive Guidance" (AG) is an efficient variant of computation-Free Guidance (CFG)
AG preserves CFG's image quality while reducing by 25%.
" LinearAG" offers even cheaper inference at the cost of deviating from the baseline model.
arXiv Detail & Related papers (2023-12-19T17:08:48Z) - End-to-End Diffusion Latent Optimization Improves Classifier Guidance [81.27364542975235]
Direct Optimization of Diffusion Latents (DOODL) is a novel guidance method.
It enables plug-and-play guidance by optimizing diffusion latents.
It outperforms one-step classifier guidance on computational and human evaluation metrics.
arXiv Detail & Related papers (2023-03-23T22:43:52Z) - Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.