A Dense Reward View on Aligning Text-to-Image Diffusion with Preference
- URL: http://arxiv.org/abs/2402.08265v2
- Date: Sun, 12 May 2024 21:02:59 GMT
- Title: A Dense Reward View on Aligning Text-to-Image Diffusion with Preference
- Authors: Shentao Yang, Tianqi Chen, Mingyuan Zhou
- Abstract summary: We propose a tractable alignment objective that emphasizes the initial steps of the T2I reverse chain.
In experiments on single and multiple prompt generation, our method is competitive with strong relevant baselines.
- Score: 54.43177605637759
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Aligning text-to-image (T2I) diffusion models with preference has been gaining increasing research attention. While prior works directly optimize T2I models with preference data, these methods are developed under the bandit assumption of a latent reward on the entire diffusion reverse chain, ignoring the sequential nature of the generation process. This may harm the efficacy and efficiency of preference alignment. In this paper, we take a finer dense reward perspective and derive a tractable alignment objective that emphasizes the initial steps of the T2I reverse chain. In particular, we introduce temporal discounting into DPO-style explicit-reward-free objectives, to break the temporal symmetry therein and suit the T2I generation hierarchy. In experiments on single and multiple prompt generation, our method is competitive with strong relevant baselines, both quantitatively and qualitatively. Further investigations are conducted to illustrate the insight of our approach.
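The core idea of the abstract — a DPO-style preference objective whose per-step terms are discounted so that early reverse-chain steps dominate — can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the function name, the per-step log-likelihood inputs, and the geometric discount `gamma` are assumptions made for the example.

```python
import torch
import torch.nn.functional as F

def discounted_dpo_loss(logp_w, logp_l, logp_ref_w, logp_ref_l,
                        beta=0.1, gamma=0.99):
    """Hypothetical DPO-style preference loss with temporal discounting.

    Each input is a tensor of shape (T,), holding per-step log-likelihoods
    over the T reverse-chain steps (index 0 = first denoising step) for the
    preferred (w) and dispreferred (l) trajectories, under the trained
    policy and a frozen reference model.
    """
    T = logp_w.shape[0]
    # Geometric discount: weights decay with step index, so the initial
    # reverse-chain steps contribute most (breaking temporal symmetry).
    weights = gamma ** torch.arange(T, dtype=logp_w.dtype)
    # Per-step implicit-reward margin: policy vs. reference, winner vs. loser.
    margin = (logp_w - logp_ref_w) - (logp_l - logp_ref_l)
    # Standard DPO logistic loss applied to the discounted margin sum.
    return -F.logsigmoid(beta * (weights * margin).sum())
```

Setting `gamma=1.0` recovers a vanilla (temporally symmetric) trajectory-level DPO objective, which makes the role of the discount easy to ablate.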
Related papers
- Motion Prior Distillation in Time Reversal Sampling for Generative Inbetweening [23.537461698380607]
We propose Motion Prior Distillation (MPD), a simple yet effective inference-time distillation technique. MPD suppresses bidirectional mismatch by distilling the motion residual of the forward path into the backward path. Our method can deliberately avoid denoising the end-conditioned path, which causes path ambiguity.
arXiv Detail & Related papers (2026-02-13T07:20:45Z) - DeLeaker: Dynamic Inference-Time Reweighting For Semantic Leakage Mitigation in Text-to-Image Models [55.30555646945055]
Text-to-Image (T2I) models are vulnerable to semantic leakage. We introduce DeLeaker, a lightweight approach that mitigates leakage by directly intervening on the model's attention maps. SLIM is the first dataset dedicated to semantic leakage.
arXiv Detail & Related papers (2025-10-16T17:39:21Z) - Diverse Text-to-Image Generation via Contrastive Noise Optimization [60.48914865049489]
Text-to-image (T2I) diffusion models have demonstrated impressive performance in generating high-fidelity images. Existing approaches typically optimize intermediate latents or text conditions during inference. We introduce Contrastive Noise Optimization, a simple yet effective method that addresses the diversity issue from a distinct perspective.
arXiv Detail & Related papers (2025-10-04T13:51:32Z) - MIRA: Towards Mitigating Reward Hacking in Inference-Time Alignment of T2I Diffusion Models [86.07486858219137]
Diffusion models excel at generating images conditioned on text prompts. The resulting images often do not satisfy user-specific criteria measured by scalar rewards such as Aesthetic Scores. Recently, inference-time alignment via noise optimization has emerged as an efficient alternative. We show that this approach suffers from reward hacking, where the model produces images that score highly, yet deviate significantly from the original prompt.
arXiv Detail & Related papers (2025-10-02T00:47:36Z) - Free Lunch Alignment of Text-to-Image Diffusion Models without Preference Image Pairs [36.42060582800515]
We introduce Text Preference Optimization (TPO), a framework that enables "free-lunch" alignment of T2I models. TPO works by training the model to prefer matched prompts over mismatched prompts. Our framework is general and compatible with existing preference-based algorithms.
arXiv Detail & Related papers (2025-09-30T04:32:34Z) - Enhancing Diffusion-based Unrestricted Adversarial Attacks via Adversary Preferences Alignment [26.95607772298534]
APA (Adversary Preferences Alignment) is a two-stage framework that decouples conflicting preferences and optimizes each with differentiable rewards. APA achieves significantly better attack transferability while maintaining high visual consistency, inspiring further research to approach adversarial attacks from an alignment perspective.
arXiv Detail & Related papers (2025-06-02T10:18:09Z) - Towards Dataset Copyright Evasion Attack against Personalized Text-to-Image Diffusion Models [52.877452505561706]
We propose the first copyright evasion attack specifically designed to undermine dataset ownership verification (DOV). Our CEAT2I comprises three stages: watermarked sample detection, trigger identification, and efficient watermark mitigation. Our experiments show that our CEAT2I effectively evades DOV mechanisms while preserving model performance.
arXiv Detail & Related papers (2025-05-05T17:51:55Z) - GenDR: Lightning Generative Detail Restorator [18.465568249533966]
We present a one-step diffusion model for generative detail restoration, GenDR, distilled from a tailored diffusion model with larger latent space.
Experimental results demonstrate that GenDR achieves state-of-the-art performance in both quantitative metrics and visual fidelity.
arXiv Detail & Related papers (2025-03-09T22:02:18Z) - Reward-Guided Iterative Refinement in Diffusion Models at Test-Time with Applications to Protein and DNA Design [87.58981407469977]
We propose a novel framework for inference-time reward optimization with diffusion models inspired by evolutionary algorithms.
Our approach employs an iterative refinement process consisting of two steps in each iteration: noising and reward-guided denoising.
arXiv Detail & Related papers (2025-02-20T17:48:45Z) - Privacy Protection in Personalized Diffusion Models via Targeted Cross-Attention Adversarial Attack [5.357486699062561]
We propose a novel and efficient adversarial attack method, Concept Protection by Selective Attention Manipulation (CoPSAM).
For this purpose, we carefully construct an imperceptible noise to be added to clean samples to get their adversarial counterparts.
Experimental validation on a subset of CelebA-HQ face images dataset demonstrates that our approach outperforms existing methods.
arXiv Detail & Related papers (2024-11-25T14:39:18Z) - Diffusion-RPO: Aligning Diffusion Models through Relative Preference Optimization [68.69203905664524]
We introduce Diffusion-RPO, a new method designed to align diffusion-based T2I models with human preferences more effectively.
We have developed a new evaluation metric, style alignment, aimed at overcoming the challenges of high cost and low interpretability.
Our findings demonstrate that Diffusion-RPO outperforms established methods such as Supervised Fine-Tuning and Diffusion-DPO in tuning Stable Diffusion versions 1.5 and XL-1.0.
arXiv Detail & Related papers (2024-06-10T15:42:03Z) - Direct Consistency Optimization for Compositional Text-to-Image Personalization [73.94505688626651]
Text-to-image (T2I) diffusion models, when fine-tuned on a few personal images, are able to generate visuals with a high degree of consistency.
We propose to fine-tune the T2I model by maximizing consistency to reference images, while penalizing the deviation from the pretrained model.
arXiv Detail & Related papers (2024-02-19T09:52:41Z) - Harnessing Diffusion Models for Visual Perception with Meta Prompts [68.78938846041767]
We propose a simple yet effective scheme to harness a diffusion model for visual perception tasks.
We introduce learnable embeddings (meta prompts) to the pre-trained diffusion models to extract proper features for perception.
Our approach achieves new performance records in depth estimation tasks on NYU depth V2 and KITTI, and in semantic segmentation task on CityScapes.
arXiv Detail & Related papers (2023-12-22T14:40:55Z) - Separate-and-Enhance: Compositional Finetuning for Text2Image Diffusion Models [58.46926334842161]
This work illuminates the fundamental reasons for such misalignment, pinpointing issues related to low attention activation scores and mask overlaps.
We propose two novel objectives, the Separate loss and the Enhance loss, that reduce object mask overlaps and maximize attention scores.
Our method diverges from conventional test-time-adaptation techniques, focusing on finetuning critical parameters, which enhances scalability and generalizability.
arXiv Detail & Related papers (2023-12-10T22:07:42Z) - Beyond First-Order Tweedie: Solving Inverse Problems using Latent Diffusion [41.758635460235716]
We introduce the Second-order Tweedie sampler from Surrogate Loss (STSL).
STSL offers efficiency comparable to first-order Tweedie with a tractable reverse process using second-order approximation.
Our method surpasses SoTA solvers PSLD and P2L, achieving 4X and 8X reduction in neural function evaluations.
arXiv Detail & Related papers (2023-12-01T14:36:24Z) - Debiasing the Cloze Task in Sequential Recommendation with Bidirectional
Transformers [0.0]
We argue that Inverse Propensity Scoring (IPS) does not extend to sequential recommendation because it fails to account for the temporal nature of the problem.
We then propose a novel propensity scoring mechanism, which can theoretically debias the Cloze task in sequential recommendation.
arXiv Detail & Related papers (2023-01-22T21:44:25Z) - Improving Crowded Object Detection via Copy-Paste [6.941267349187447]
Crowdedness caused by overlapping among similar objects is a ubiquitous challenge in the field of 2D visual object detection.
We first underline two main effects of the crowdedness issue: 1) IoU-confidence correlation disturbances (ICD) and 2) confused de-duplication (CDD).
arXiv Detail & Related papers (2022-11-22T09:25:15Z) - Bias-Robust Bayesian Optimization via Dueling Bandit [57.82422045437126]
We consider Bayesian optimization in settings where observations can be adversarially biased.
We propose a novel approach for dueling bandits based on information-directed sampling (IDS).
Thereby, we obtain the first efficient kernelized algorithm for dueling bandits that comes with cumulative regret guarantees.
arXiv Detail & Related papers (2021-05-25T10:08:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.