Adaptive Classifier-Free Guidance via Dynamic Low-Confidence Masking
- URL: http://arxiv.org/abs/2505.20199v1
- Date: Mon, 26 May 2025 16:40:22 GMT
- Title: Adaptive Classifier-Free Guidance via Dynamic Low-Confidence Masking
- Authors: Pengxiang Li, Shilin Yan, Joey Tsai, Renrui Zhang, Ruichuan An, Ziyu Guo, Xiaowei Gao,
- Abstract summary: We introduce Adaptive Classifier-Free Guidance (A-CFG), a novel method that tailors the unconditional input by leveraging the model's predictive confidence. A-CFG focuses on areas of ambiguity, leading to more effective guidance. Experiments on diverse language generation benchmarks show that A-CFG yields substantial improvements over standard CFG.
- Score: 15.052244821404079
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Classifier-Free Guidance (CFG) significantly enhances controllability in generative models by interpolating conditional and unconditional predictions. However, standard CFG often employs a static unconditional input, which can be suboptimal for iterative generation processes where model uncertainty varies dynamically. We introduce Adaptive Classifier-Free Guidance (A-CFG), a novel method that tailors the unconditional input by leveraging the model's instantaneous predictive confidence. At each step of an iterative (masked) diffusion language model, A-CFG identifies tokens in the currently generated sequence for which the model exhibits low confidence. These tokens are temporarily re-masked to create a dynamic, localized unconditional input. This focuses CFG's corrective influence precisely on areas of ambiguity, leading to more effective guidance. We integrate A-CFG into a state-of-the-art masked diffusion language model and demonstrate its efficacy. Experiments on diverse language generation benchmarks show that A-CFG yields substantial improvements over standard CFG, achieving, for instance, a 3.9 point gain on GPQA. Our work highlights the benefit of dynamically adapting guidance mechanisms to model uncertainty in iterative generation.
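To make the mechanism concrete, here is a minimal PyTorch sketch of one guided step, assuming a masked diffusion LM exposed as a callable `model(tokens) -> logits`. The confidence heuristic (probability assigned to each filled token), the `remask_frac` parameter, and the token-selection rule are illustrative assumptions, not the paper's exact recipe.

```python
import torch
import torch.nn.functional as F

def a_cfg_step(model, tokens, mask_id, guidance_scale=2.0, remask_frac=0.1):
    """One A-CFG-style guided step. `tokens` is (batch, seq_len) token ids."""
    # Conditional pass on the current partially generated sequence.
    cond_logits = model(tokens)                                # (B, T, V)
    probs = F.softmax(cond_logits, dim=-1)

    # Confidence = probability the model assigns to each already-filled token.
    conf = probs.gather(-1, tokens.unsqueeze(-1)).squeeze(-1)  # (B, T)
    filled = tokens != mask_id
    conf = conf.masked_fill(~filled, float("inf"))  # never pick masked slots

    # Re-mask the lowest-confidence filled tokens to build the dynamic,
    # localized unconditional input.
    k = max(1, int(remask_frac * int(filled.sum(dim=-1).min())))
    low_conf = conf.topk(k, dim=-1, largest=False).indices     # (B, k)
    uncond_tokens = tokens.scatter(-1, low_conf, mask_id)

    # Standard CFG interpolation between the two passes.
    uncond_logits = model(uncond_tokens)
    return uncond_logits + guidance_scale * (cond_logits - uncond_logits)
```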
Related papers
- Decoupled Classifier-Free Guidance for Counterfactual Diffusion Models [17.44485184010655]
Decoupled Classifier-Free Guidance (DCFG) is a flexible and model-agnostic framework that introduces group-wise conditioning control.
DCFG builds on an attribute-split embedding strategy that disentangles semantic inputs, enabling selective guidance on user-defined attribute groups.
Experiments on CelebA-HQ, MIMIC-CXR, and EMBED show that DCFG improves intervention fidelity, mitigates unintended changes, and enhances reversibility.
arXiv Detail & Related papers (2025-06-17T10:56:09Z)
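As a rough illustration of group-wise conditioning control, the sketch below applies one guidance term per attribute group, each with its own scale; the interface (`eps_fn`, `group_conds`, `null_cond`) and the additive combination rule are assumptions, not DCFG's published formulation.

```python
import torch

def groupwise_cfg(eps_fn, x, group_conds, null_cond, scales):
    """Generic group-wise guidance: one correction term per attribute group.

    eps_fn(x, cond) -> noise prediction; `group_conds` holds one condition
    embedding per user-defined attribute group (other groups nulled out).
    """
    eps_uncond = eps_fn(x, null_cond)
    guided = eps_uncond.clone()
    for cond_g, w_g in zip(group_conds, scales):
        # Selective guidance: each group contributes its own scaled term.
        guided = guided + w_g * (eps_fn(x, cond_g) - eps_uncond)
    return guided
```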
- Token Perturbation Guidance for Diffusion Models [1.511194037740325]
Token Perturbation Guidance (TPG) is a novel method that applies perturbation matrices directly to intermediate token representations within the diffusion network.
TPG is training-free and agnostic to input conditions, making it readily applicable to both conditional and unconditional generation.
arXiv Detail & Related papers (2025-06-10T21:25:46Z)
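A hedged sketch of the idea: a token permutation is one simple norm-preserving perturbation matrix, and the perturbed forward pass can serve as the weak branch of a CFG-style update. The hook-based `denoise_perturbed` interface and the exact guidance form are assumptions based on the abstract.

```python
import torch

def shuffle_tokens(h):
    """Permute token representations; h: (batch, tokens, dim).

    A permutation matrix preserves per-token norms, mirroring the kind of
    perturbation TPG applies inside the network (sketch, not the paper's code).
    """
    perm = torch.randperm(h.shape[1], device=h.device)
    return h[:, perm, :]

def tpg_guidance(denoise, denoise_perturbed, x, t, scale=3.0):
    """Use the perturbed pass as the weak/negative branch (assumed form).

    `denoise_perturbed` is assumed to run the same network with shuffle_tokens
    applied to intermediate token representations via forward hooks.
    """
    eps = denoise(x, t)
    eps_weak = denoise_perturbed(x, t)
    return eps_weak + scale * (eps - eps_weak)
```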
- Feedback Guidance of Diffusion Models [0.0]
Classifier-Free Guidance (CFG) has become standard for improving sample fidelity in conditional diffusion models.
We propose FeedBack Guidance (FBG), which uses a state-dependent coefficient to self-regulate guidance amounts based on need.
arXiv Detail & Related papers (2025-06-06T13:46:32Z)
- Normalized Attention Guidance: Universal Negative Guidance for Diffusion Models [57.20761595019967]
We present Normalized Attention Guidance (NAG), an efficient, training-free mechanism that applies extrapolation in attention space with L1-based normalization and refinement.
NAG restores effective negative guidance where CFG collapses while maintaining fidelity.
NAG generalizes across architectures (UNet, DiT), sampling regimes (few-step, multi-step), and modalities (image, video).
arXiv Detail & Related papers (2025-05-27T13:30:46Z)
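A minimal sketch of the attention-space update suggested by the NAG abstract: extrapolate away from the negative branch, cap the drift with an L1-norm ratio, then blend back toward the positive branch. All constants and the exact normalization rule are assumptions.

```python
import torch

def nag_attention(z_pos, z_neg, scale=4.0, tau=2.5, alpha=0.25):
    """NAG-style update on attention outputs (sketch; constants assumed).

    z_pos / z_neg: attention outputs under positive / negative prompts.
    """
    # Extrapolate away from the negative branch in attention space.
    z = z_pos + scale * (z_pos - z_neg)

    # L1-based normalization: cap how far the extrapolation may drift.
    ratio = z.abs().sum(dim=-1, keepdim=True) / (
        z_pos.abs().sum(dim=-1, keepdim=True) + 1e-8)
    z = torch.where(ratio > tau, z * (tau / ratio), z)

    # Refinement: blend back toward the positive branch to keep fidelity.
    return alpha * z + (1.0 - alpha) * z_pos
```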
- CtrlDiff: Boosting Large Diffusion Language Models with Dynamic Block Prediction and Controllable Generation [7.250878248686215]
Diffusion-based language models have emerged as a compelling alternative due to their powerful parallel generation capabilities and inherent editability.
We propose CtrlDiff, a dynamic and controllable semi-autoregressive framework that adaptively determines the size of each generation block based on local semantics.
arXiv Detail & Related papers (2025-05-20T14:52:41Z)
- FELLE: Autoregressive Speech Synthesis with Token-Wise Coarse-to-Fine Flow Matching [51.32059240975148]
FELLE is an autoregressive model that integrates language modeling with token-wise flow matching.
For each continuous-valued token, FELLE modifies the general prior distribution in flow matching by incorporating information from the previous step.
FELLE generates continuous-valued tokens hierarchically, conditioned on the language model's output.
arXiv Detail & Related papers (2025-02-16T13:54:32Z)
- Contrastive CFG: Improving CFG in Diffusion Models by Contrasting Positive and Negative Concepts [55.298031232672734]
Classifier-Free Guidance (CFG) has proven effective in conditional diffusion model sampling for improved condition alignment.
We present a novel method to enhance negative CFG guidance using contrastive loss.
arXiv Detail & Related papers (2024-11-26T03:29:27Z)
- Steering Masked Discrete Diffusion Models via Discrete Denoising Posterior Prediction [88.65168366064061]
We introduce Discrete Denoising Posterior Prediction (DDPP), a novel framework that casts the task of steering pre-trained MDMs as a problem of probabilistic inference.
Our framework leads to a family of three novel objectives that are all simulation-free, and thus scalable.
We substantiate our designs via wet-lab validation, where we observe transient expression of reward-optimized protein sequences.
arXiv Detail & Related papers (2024-10-10T17:18:30Z)
- Eliminating Oversaturation and Artifacts of High Guidance Scales in Diffusion Models [27.640009920058187]
We revisit the CFG update rule and introduce modifications to address this issue.
We propose down-weighting the parallel component to achieve high-quality generations without oversaturation.
We also introduce a new rescaling momentum method for the CFG update rule based on this insight.
arXiv Detail & Related papers (2024-10-03T12:06:29Z)
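The abstract above suggests decomposing the CFG update and shrinking its parallel component; the sketch below does exactly that, using the conditional prediction as the reference direction. That reference choice and the down-weighting factor `beta` are assumptions.

```python
import torch

def project(v, ref, eps=1e-8):
    """Split v into components parallel and orthogonal to ref (per sample)."""
    dims = tuple(range(1, v.ndim))
    coef = (v * ref).sum(dim=dims, keepdim=True) / (
        (ref * ref).sum(dim=dims, keepdim=True) + eps)
    parallel = coef * ref
    return parallel, v - parallel

def downweighted_cfg(eps_cond, eps_uncond, guidance_scale=7.5, beta=0.25):
    """CFG update with the parallel component down-weighted (sketched form)."""
    diff = eps_cond - eps_uncond
    parallel, orthogonal = project(diff, eps_cond)
    # Oversaturation is attributed mainly to the parallel part, so shrink it.
    diff = orthogonal + beta * parallel
    return eps_cond + (guidance_scale - 1.0) * diff
```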
- No Training, No Problem: Rethinking Classifier-Free Guidance for Diffusion Models [25.301443993960277]
We revisit the core principles of CFG and introduce a new method, Independent Condition Guidance (ICG).
ICG provides the benefits of CFG without the need for any special training procedures.
Our approach streamlines the training process of conditional diffusion models and can also be applied during inference on any pre-trained conditional model.
arXiv Detail & Related papers (2024-07-02T22:04:00Z)
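A sketch of the ICG idea described above: replace the learned null condition with a condition drawn independently of the input, so any pre-trained conditional model can be guided without special training. Sampling the independent condition as Gaussian noise is an assumption; the paper may use a different sampling scheme.

```python
import torch

def icg_guidance(eps_fn, x, t, cond_emb, scale=5.0):
    """ICG-style guidance sketch: no null-condition training required.

    eps_fn(x, t, cond) -> noise prediction. The independent condition
    plays the role of the unconditional branch in standard CFG.
    """
    indep_cond = torch.randn_like(cond_emb)   # independent of x (assumption)
    eps_indep = eps_fn(x, t, indep_cond)
    eps_cond = eps_fn(x, t, cond_emb)
    return eps_indep + scale * (eps_cond - eps_indep)
```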
"Adaptive Guidance" (AG) is an efficient variant of computation-Free Guidance (CFG)
AG preserves CFG's image quality while reducing by 25%.
" LinearAG" offers even cheaper inference at the cost of deviating from the baseline model.
arXiv Detail & Related papers (2023-12-19T17:08:48Z)
- Continual-MAE: Adaptive Distribution Masked Autoencoders for Continual Test-Time Adaptation [49.827306773992376]
Continual Test-Time Adaptation (CTTA) is proposed to migrate a source pre-trained model to continually changing target distributions.
Our proposed method attains state-of-the-art performance in both classification and segmentation CTTA tasks.
arXiv Detail & Related papers (2023-12-19T15:34:52Z)
- Consistency Regularization for Generalizable Source-free Domain Adaptation [62.654883736925456]
Source-free domain adaptation (SFDA) aims to adapt a well-trained source model to an unlabelled target domain without accessing the source dataset.
Existing SFDA methods ONLY assess their adapted models on the target training set, neglecting the data from unseen but identically distributed testing sets.
We propose a consistency regularization framework to develop a more generalizable SFDA method.
arXiv Detail & Related papers (2023-08-03T07:45:53Z)
- Variational Classification [51.2541371924591]
Treating inputs to the softmax layer as samples of a latent variable, our abstracted perspective reveals a potential inconsistency.
We induce a chosen latent distribution, instead of the implicit assumption found in a standard softmax layer.
We derive a variational objective to train the model, analogous to the evidence lower bound (ELBO) used to train variational auto-encoders.
arXiv Detail & Related papers (2023-05-17T17:47:19Z)