Related papers: EMAG: Self-Rectifying Diffusion Sampling with Exponential Moving Average Guidance

EMAG: Self-Rectifying Diffusion Sampling with Exponential Moving Average Guidance

URL: http://arxiv.org/abs/2512.17303v1
Date: Fri, 19 Dec 2025 07:36:07 GMT
Title: EMAG: Self-Rectifying Diffusion Sampling with Exponential Moving Average Guidance
Authors: Ankit Yadav, Ta Duc Huy, Lingqiao Liu,
Abstract summary: In diffusion and flow-matching generative models, guidance techniques are widely used to improve sample quality and consistency.<n>Recent work explores contrasting negative samples at inference using a weaker model.<n>We propose Exponential Moving Average Guidance (EMAG), a training-free mechanism that modifies attention at inference time in diffusion transformers.
Score: 31.550239698285058
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: In diffusion and flow-matching generative models, guidance techniques are widely used to improve sample quality and consistency. Classifier-free guidance (CFG) is the de facto choice in modern systems and achieves this by contrasting conditional and unconditional samples. Recent work explores contrasting negative samples at inference using a weaker model, via strong/weak model pairs, attention-based masking, stochastic block dropping, or perturbations to the self-attention energy landscape. While these strategies refine the generation quality, they still lack reliable control over the granularity or difficulty of the negative samples, and target-layer selection is often fixed. We propose Exponential Moving Average Guidance (EMAG), a training-free mechanism that modifies attention at inference time in diffusion transformers, with a statistics-based, adaptive layer-selection rule. Unlike prior methods, EMAG produces harder, semantically faithful negatives (fine-grained degradations), surfacing difficult failure modes, enabling the denoiser to refine subtle artifacts, boosting the quality and human preference score (HPS) by +0.46 over CFG. We further demonstrate that EMAG naturally composes with advanced guidance techniques, such as APG and CADS, further improving HPS.

Related papers

Toward the Frontiers of Reliable Diffusion Sampling via Adversarial Sinkhorn Attention Guidance [8.46069844016289]
Adversarial Sinkhorn Attention Guidance (ASAG) is a novel method that reinterprets attention scores in diffusion models through the lens of optimal transport.<n>Instead of naively corrupting the attention mechanism, ASAG injects an adversarial cost within self-attention layers to reduce pixel-wise similarity between queries and keys.<n>ASAG shows consistent improvements in text-to-image diffusion, and enhances controllability and fidelity in downstream applications such as IP-Adapter and ControlNet.
arXiv Detail & Related papers (2025-11-10T15:52:53Z)
Learning Robust Diffusion Models from Imprecise Supervision [75.53546939251146]
DMIS is a unified framework for training robust Conditional Diffusion Models from Imprecise Supervision.<n>Our framework is derived from likelihood and decomposes the objective into generative and classification components.<n>Experiments on diverse forms of imprecise supervision, covering tasks covering image generation, weakly supervised learning, and dataset condensation demonstrate that DMIS consistently produces high-quality and class-discriminative samples.
arXiv Detail & Related papers (2025-10-03T14:00:32Z)
G$^2$RPO: Granular GRPO for Precise Reward in Flow Models [74.21206048155669]
We propose a novel Granular-GRPO (G$2$RPO) framework that achieves precise and comprehensive reward assessments of sampling directions.<n>We introduce a Multi-Granularity Advantage Integration module that aggregates advantages computed at multiple diffusion scales.<n>Our G$2$RPO significantly outperforms existing flow-based GRPO baselines.
arXiv Detail & Related papers (2025-10-02T12:57:12Z)
Generate Aligned Anomaly: Region-Guided Few-Shot Anomaly Image-Mask Pair Synthesis for Industrial Inspection [53.137651284042434]
Anomaly inspection plays a vital role in industrial manufacturing, but the scarcity of anomaly samples limits the effectiveness of existing methods.<n>We propose Generate grained Anomaly (GAA), a region-guided, few-shot anomaly image-mask pair generation framework.<n>GAA generates realistic, diverse, and semantically aligned anomalies using only a small number of samples.
arXiv Detail & Related papers (2025-07-13T12:56:59Z)
ScoreAdv: Score-based Targeted Generation of Natural Adversarial Examples via Diffusion Models [13.250999667915254]
In this paper, we introduce a novel approach for generating adversarial examples based on diffusion models, named ScoreAdv.<n>Our method is capable of generating an unlimited number of natural adversarial examples and can attack not only classification models but also retrieval models.<n>Our results demonstrate that ScoreAdv achieves state-of-the-art attack success rates and image quality, while maintaining inference efficiency.
arXiv Detail & Related papers (2025-07-08T15:17:24Z)
Diffusion Sampling Path Tells More: An Efficient Plug-and-Play Strategy for Sample Filtering [18.543769006014383]
Diffusion models often exhibit inconsistent sample quality due to variations inherent in their sampling trajectories.<n>We introduce CFG-Rejection, an efficient, plug-and-play strategy that filters low-quality samples at an early stage of the denoising process.<n>We validate the effectiveness of CFG-Rejection in image generation through extensive experiments.
arXiv Detail & Related papers (2025-05-29T11:08:24Z)
Normalized Attention Guidance: Universal Negative Guidance for Diffusion Models [57.20761595019967]
We present Normalized Attention Guidance (NAG), an efficient, training-free mechanism that applies extrapolation in attention space with L1-based normalization and refinement.<n>NAG restores effective negative guidance where CFG collapses while maintaining fidelity.<n>NAG generalizes across architectures (UNet, DiT), sampling regimes (few-step, multi-step), and modalities (image, video)
arXiv Detail & Related papers (2025-05-27T13:30:46Z)
Self-Guidance: Boosting Flow and Diffusion Generation on Their Own [35.56845917727121]
Self-Guidance (SG) can significantly improve the quality of the generated image by suppressing the generation of low-quality samples.<n>SG relies on the sampling score function of the original diffusion or flow model at different noise levels.<n>We conduct extensive experiments on text-to-image and text-to-video generation with different architectures.
arXiv Detail & Related papers (2024-12-08T06:32:27Z)
Contrastive CFG: Improving CFG in Diffusion Models by Contrasting Positive and Negative Concepts [55.298031232672734]
As-Free Guidance (CFG) has proven effective in conditional diffusion model sampling for improved condition alignment. We present a novel method to enhance negative CFG guidance using contrastive loss.
arXiv Detail & Related papers (2024-11-26T03:29:27Z)
DifAugGAN: A Practical Diffusion-style Data Augmentation for GAN-based Single Image Super-resolution [88.13972071356422]
We propose a diffusion-style data augmentation scheme for GAN-based image super-resolution (SR) methods, known as DifAugGAN. It involves adapting the diffusion process in generative diffusion models for improving the calibration of the discriminator during training. Our DifAugGAN can be a Plug-and-Play strategy for current GAN-based SISR methods to improve the calibration of the discriminator and thus improve SR performance.
arXiv Detail & Related papers (2023-11-30T12:37:53Z)

This list is automatically generated from the titles and abstracts of the papers in this site.