Diffusion-SDPO: Safeguarded Direct Preference Optimization for Diffusion Models
- URL: http://arxiv.org/abs/2511.03317v1
- Date: Wed, 05 Nov 2025 09:30:49 GMT
- Title: Diffusion-SDPO: Safeguarded Direct Preference Optimization for Diffusion Models
- Authors: Minghao Fu, Guo-Hua Wang, Tianyu Cui, Qing-Guo Chen, Zhao Xu, Weihua Luo, Kaifu Zhang,
- Abstract summary: Diffusion-SDPO is a safeguarded update rule that preserves the winner by adaptively scaling the loser gradient according to its alignment with the winner gradient. Our method is simple, model-agnostic, broadly compatible with existing DPO-style alignment frameworks, and adds only marginal computational overhead.
- Score: 38.27881260102189
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Text-to-image diffusion models deliver high-quality images, yet aligning them with human preferences remains challenging. We revisit diffusion-based Direct Preference Optimization (DPO) for these models and identify a critical pathology: enlarging the preference margin does not necessarily improve generation quality. In particular, the standard Diffusion-DPO objective can increase the reconstruction error of both winner and loser branches. Consequently, degradation of the less-preferred outputs can become sufficiently severe that the preferred branch is also adversely affected even as the margin grows. To address this, we introduce Diffusion-SDPO, a safeguarded update rule that preserves the winner by adaptively scaling the loser gradient according to its alignment with the winner gradient. A first-order analysis yields a closed-form scaling coefficient that guarantees the error of the preferred output is non-increasing at each optimization step. Our method is simple, model-agnostic, broadly compatible with existing DPO-style alignment frameworks and adds only marginal computational overhead. Across standard text-to-image benchmarks, Diffusion-SDPO delivers consistent gains over preference-learning baselines on automated preference, aesthetic, and prompt alignment metrics. Code is publicly available at https://github.com/AIDC-AI/Diffusion-SDPO.
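The safeguard described in the abstract can be illustrated with a minimal sketch. The idea: a DPO-style step pushes parameters along -(g_w - λ·g_l), where g_w and g_l are the gradients of the winner and loser reconstruction errors; to first order, the winner's error change is -lr·(||g_w||² - λ·g_w·g_l), which stays non-positive if λ·(g_w·g_l) ≤ ||g_w||². The coefficient below is one plausible closed form satisfying that bound, not necessarily the paper's exact formula; `safeguarded_update` and its arguments are hypothetical names for illustration.

```python
import numpy as np

def safeguarded_update(theta, grad_w, grad_l, lr=1e-2):
    """One hypothetical safeguarded DPO-style step (first-order sketch).

    theta  : flat parameter vector
    grad_w : gradient of the winner (preferred) branch's reconstruction error
    grad_l : gradient of the loser (rejected) branch's reconstruction error
    """
    dot = float(grad_w @ grad_l)
    if dot <= 0.0:
        # Loser gradient does not oppose the winner; no scaling needed.
        lam = 1.0
    else:
        # Cap lambda so the first-order winner-error change,
        # -lr * (||grad_w||^2 - lam * grad_w.grad_l), stays <= 0.
        lam = min(1.0, float(grad_w @ grad_w) / dot)
    delta = -lr * (grad_w - lam * grad_l)
    return theta + delta, lam
```

With aligned gradients (positive dot product), λ shrinks below 1 exactly when the unscaled loser term would increase the winner's error; with opposed gradients the full loser gradient is kept.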
Related papers
- Region-Normalized DPO for Medical Image Segmentation under Noisy Judges [7.10111238784554]
Region-Normalized DPO is a segmentation-aware objective which normalizes preference updates by the size of the disagreement region between masks. It stabilizes preference-based fine-tuning, outperforming standard DPO and strong baselines without requiring additional pixel annotations.
arXiv Detail & Related papers (2026-01-30T17:45:53Z)
- AMaPO: Adaptive Margin-attached Preference Optimization for Language Model Alignment [25.526336903358757]
Offline preference optimization offers a simpler and more stable alternative to RLHF for aligning language models. We propose Adaptive Margin-attached Preference Optimization (AMaPO), a simple yet principled algorithm. AMaPO employs an instance-wise adaptive margin, refined by Z-normalization and exponential scaling, which dynamically reallocates learning effort by amplifying gradients for misranked samples and suppressing them for correct ones.
arXiv Detail & Related papers (2025-11-12T14:51:59Z)
- PC-Diffusion: Aligning Diffusion Models with Human Preferences via Preference Classifier [36.21450058652141]
We propose a novel framework for human preference alignment in diffusion models (PC-Diffusion). PC-Diffusion uses a lightweight, trainable Preference Classifier that directly models the relative preference between samples. We show that PC-Diffusion achieves comparable preference consistency to DPO while significantly reducing training costs and enabling efficient preference-guided generation.
arXiv Detail & Related papers (2025-11-11T03:53:06Z)
- Finetuning-Free Personalization of Text to Image Generation via Hypernetworks [15.129799519953139]
We introduce fine-tuning-free personalization via Hypernetworks that predict LoRA-adapted weights directly from subject images. Our approach achieves strong personalization performance and highlights the promise of hypernetworks as a scalable and effective direction for open-category personalization.
arXiv Detail & Related papers (2025-11-05T03:31:33Z)
- Margin Adaptive DPO: Leveraging Reward Model for Granular Control in Preference Optimization [0.0]
Margin-Adaptive Direct Preference Optimization provides a stable, data-preserving, and instance-level solution. We provide a comprehensive theoretical analysis, proving that MADPO has a well-behaved optimization landscape. It achieves performance gains of up to +33.3% on High Quality data and +10.5% on Low Quality data over the next-best method.
arXiv Detail & Related papers (2025-10-06T20:09:37Z)
- Smoothed Preference Optimization via ReNoise Inversion for Aligning Diffusion Models with Varied Human Preferences [13.588231827053923]
Direct Preference Optimization (DPO) aligns text-to-image (T2I) generation models with human preferences using pairwise preference data. We propose SmPO-Diffusion, a novel method for modeling preference distributions to improve the DPO objective. Our approach effectively mitigates issues of excessive optimization and objective misalignment present in existing methods.
arXiv Detail & Related papers (2025-06-03T09:47:22Z)
- Self-NPO: Negative Preference Optimization of Diffusion Models by Simply Learning from Itself without Explicit Preference Annotations [60.143658714894336]
Diffusion models have demonstrated remarkable success in various visual generation tasks, including image, video, and 3D content generation. Preference optimization (PO) is a prominent and growing area of research that aims to align these models with human preferences. We introduce Self-NPO, a Negative Preference Optimization approach that learns exclusively from the model itself.
arXiv Detail & Related papers (2025-05-17T01:03:46Z)
- SUDO: Enhancing Text-to-Image Diffusion Models with Self-Supervised Direct Preference Optimization [19.087540230261684]
Previous text-to-image diffusion models typically employ supervised fine-tuning to enhance pre-trained base models. We introduce Self-sUpervised Direct preference Optimization (SUDO), a novel paradigm that optimizes both fine-grained details at the pixel level and global image quality. As an effective alternative to supervised fine-tuning, SUDO can be seamlessly applied to any text-to-image diffusion model.
arXiv Detail & Related papers (2025-04-20T08:18:27Z)
- Refining Alignment Framework for Diffusion Models with Intermediate-Step Preference Ranking [50.325021634589596]
We propose a Tailored Optimization Preference (TailorPO) framework for aligning diffusion models with human preference. Our approach directly ranks intermediate noisy samples based on their step-wise reward, and effectively resolves the gradient direction issues. Experimental results demonstrate that our method significantly improves the model's ability to generate aesthetically pleasing and human-preferred images.
arXiv Detail & Related papers (2025-02-01T16:08:43Z)
- Uncertainty-Penalized Direct Preference Optimization [52.387088396044206]
We develop a pessimistic framework for DPO by introducing preference uncertainty penalization schemes.
The penalization serves as a correction to the loss which attenuates the loss gradient for uncertain samples.
We show improved overall performance compared to vanilla DPO, as well as better completions on prompts from high-uncertainty chosen/rejected responses.
arXiv Detail & Related papers (2024-10-26T14:24:37Z)
- Model Inversion Attacks Through Target-Specific Conditional Diffusion Models [54.69008212790426]
Model inversion attacks (MIAs) aim to reconstruct private images from a target classifier's training set, thereby raising privacy concerns in AI applications.
Previous GAN-based MIAs tend to suffer from inferior generative fidelity due to GAN's inherent flaws and biased optimization within latent space.
We propose Diffusion-based Model Inversion (Diff-MI) attacks to alleviate these issues.
arXiv Detail & Related papers (2024-07-16T06:38:49Z)
- Diffusion Model Alignment Using Direct Preference Optimization [103.2238655827797]
Diffusion-DPO is a method to align diffusion models to human preferences by directly optimizing on human comparison data.
We fine-tune the base model of the state-of-the-art Stable Diffusion XL (SDXL)-1.0 model with Diffusion-DPO.
We also develop a variant that uses AI feedback and has comparable performance to training on human preferences.
arXiv Detail & Related papers (2023-11-21T15:24:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.