DETONATE: A Benchmark for Text-to-Image Alignment and Kernelized Direct Preference Optimization
- URL: http://arxiv.org/abs/2506.14903v1
- Date: Tue, 17 Jun 2025 18:17:35 GMT
- Title: DETONATE: A Benchmark for Text-to-Image Alignment and Kernelized Direct Preference Optimization
- Authors: Renjith Prasad, Abhilekh Borah, Hasnat Md Abdullah, Chathurangi Shyalika, Gurpreet Singh, Ritvik Garimella, Rajarshi Roy, Harshul Surana, Nasrin Imanpour, Suranjana Trivedy, Amit Sheth, Amitava Das
- Abstract summary: This paper introduces DPO-Kernels for text-to-image (T2I) models, a novel extension enhancing alignment across three dimensions. We introduce DETONATE, the first large-scale benchmark of its kind, comprising approximately 100K curated image pairs. We also propose the Alignment Quality Index (AQI), a novel geometric measure of latent-space separability of safe/unsafe image activations.
- Score: 4.496316647545223
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Alignment is crucial for text-to-image (T2I) models to ensure that generated images faithfully capture user intent while maintaining safety and fairness. Direct Preference Optimization (DPO), prominent in large language models (LLMs), is extending its influence to T2I systems. This paper introduces DPO-Kernels for T2I models, a novel extension enhancing alignment across three dimensions: (i) Hybrid Loss, integrating embedding-based objectives with traditional probability-based loss for improved optimization; (ii) Kernelized Representations, employing Radial Basis Function (RBF), Polynomial, and Wavelet kernels for richer feature transformations and better separation between safe and unsafe inputs; and (iii) Divergence Selection, expanding beyond DPO's default Kullback-Leibler (KL) regularizer by incorporating Wasserstein and Rényi divergences for enhanced stability and robustness. We introduce DETONATE, the first large-scale benchmark of its kind, comprising approximately 100K curated image pairs categorized as chosen and rejected. DETONATE encapsulates three axes of social bias and discrimination: Race, Gender, and Disability. Prompts are sourced from hate speech datasets, with images generated by leading T2I models including Stable Diffusion 3.5 Large, Stable Diffusion XL, and Midjourney. Additionally, we propose the Alignment Quality Index (AQI), a novel geometric measure quantifying latent-space separability of safe/unsafe image activations, revealing hidden vulnerabilities. Empirically, we demonstrate that DPO-Kernels maintain strong generalization bounds via Heavy-Tailed Self-Regularization (HT-SR). DETONATE and complete code are publicly released.
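As a rough illustration of the building blocks the abstract names, the sketch below shows the standard DPO loss for a single chosen/rejected pair, together with textbook RBF and polynomial kernels over embedding vectors. It is not the authors' implementation: the function names, the default values of beta, gamma, degree, and coef0, and the use of scalar log-likelihoods are all assumptions made for illustration (diffusion models typically need a proxy for the log-likelihood). The abstract does not say how the kernel terms enter the hybrid loss, so the pieces are shown separately.

```python
import math

# Illustrative sketch only; names and defaults are assumptions,
# not taken from the DETONATE / DPO-Kernels code release.

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) preference pair.

    Inputs are log-likelihoods under the policy and a frozen reference
    model. Returns -log(sigmoid(margin)).
    """
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    # -log(sigmoid(m)) equals softplus(-m); this form avoids overflow.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))

def rbf_kernel(x, y, gamma=1.0):
    """RBF kernel: exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel: (<x, y> + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree

if __name__ == "__main__":
    # The policy prefers the chosen image more than the reference does,
    # so the loss falls below log(2), its value at zero margin.
    print(dpo_loss(-1.0, -3.0, -2.0, -2.0))
    print(rbf_kernel([1.0, 2.0], [1.0, 2.0]))
    print(polynomial_kernel([1.0, 2.0], [3.0, 4.0]))
```

With a zero margin the loss equals log 2, and it decreases as the policy favors the chosen sample more strongly than the reference model does. The kernels measure similarity between embeddings, which is the role the abstract assigns to Kernelized Representations.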
Related papers
- Leveraging Vision-Language Models to Select Trustworthy Super-Resolution Samples Generated by Diffusion Models [0.026861992804651083]
This paper introduces a robust framework for identifying the most trustworthy SR sample from a diffusion-generated set. We propose a novel Trustworthiness Score (TWS), a hybrid metric that quantifies SR reliability based on semantic similarity. By aligning outputs with human expectations and semantic correctness, this work sets a new benchmark for trustworthiness in generative SR.
arXiv Detail & Related papers (2025-06-25T21:00:44Z) - DSwinIR: Rethinking Window-based Attention for Image Restoration [109.38288333994407]
We propose the Deformable Sliding Window Transformer (DSwinIR) as a new foundational backbone architecture for image restoration. At the heart of DSwinIR is the proposed novel Deformable Sliding Window (DSwin) Attention. Extensive experiments show that DSwinIR sets a new state-of-the-art across a wide spectrum of image restoration tasks.
arXiv Detail & Related papers (2025-04-07T09:24:41Z) - FUSE: Label-Free Image-Event Joint Monocular Depth Estimation via Frequency-Decoupled Alignment and Degradation-Robust Fusion [63.87313550399871]
Image-event joint depth estimation methods leverage complementary modalities for robust perception, yet face challenges in generalizability. We propose Self-supervised Transfer (PST) and a Frequency-Decoupled Fusion module (FreDF). PST establishes cross-modal knowledge transfer through latent space alignment with image foundation models. FreDF explicitly decouples high-frequency edge features from low-frequency structural components, resolving modality-specific frequency mismatches.
arXiv Detail & Related papers (2025-03-25T15:04:53Z) - YINYANG-ALIGN: Benchmarking Contradictory Objectives and Proposing Multi-Objective Optimization based DPO for Text-to-Image Alignment [6.120756739633247]
YinYangAlign is a framework that systematically quantifies the alignment fidelity of Text-to-Image (T2I) systems. It addresses six fundamental and inherently contradictory design objectives. YinYangAlign includes detailed datasets featuring human prompts, aligned (chosen) responses, misaligned (rejected) AI-generated outputs, and explanations of the underlying contradictions.
arXiv Detail & Related papers (2025-02-05T18:46:20Z) - DPO Kernels: A Semantically-Aware, Kernel-Enhanced, and Divergence-Rich Paradigm for Direct Preference Optimization [6.303144414273044]
Large language models (LLMs) have unlocked many applications but also underscore the challenge of aligning them with diverse values and preferences. Direct Preference Optimization (DPO) is central to alignment but constrained by fixed divergences and limited feature transformations.
arXiv Detail & Related papers (2025-01-05T00:08:52Z) - PromptLA: Towards Integrity Verification of Black-box Text-to-Image Diffusion Models [17.12906933388337]
Malicious actors can fine-tune text-to-image (T2I) diffusion models to generate illegal content. We propose a novel prompt selection algorithm based on learning automaton (PromptLA) for efficient and accurate verification.
arXiv Detail & Related papers (2024-12-20T07:24:32Z) - SafetyDPO: Scalable Safety Alignment for Text-to-Image Generation [68.07258248467309]
Text-to-image (T2I) models have become widespread, but their limited safety guardrails expose end users to harmful content and potentially allow for model misuse. Current safety measures are typically limited to text-based filtering or concept removal strategies, able to remove just a few concepts from the model's generative capabilities. We introduce SafetyDPO, a method for safety alignment of T2I models through Direct Preference Optimization (DPO). We train safety experts, in the form of low-rank adaptation (LoRA) matrices, able to guide the generation process away from specific safety-related
arXiv Detail & Related papers (2024-12-13T18:59:52Z) - Binarized Diffusion Model for Image Super-Resolution [61.963833405167875]
Binarization, an ultra-compression algorithm, offers the potential for effectively accelerating advanced diffusion models (DMs).
Existing binarization methods result in significant performance degradation.
We introduce a novel binarized diffusion model, BI-DiffSR, for image SR.
arXiv Detail & Related papers (2024-06-09T10:30:25Z) - One-Shot Safety Alignment for Large Language Models via Optimal Dualization [64.52223677468861]
This paper presents a perspective of dualization that reduces constrained alignment to an equivalent unconstrained alignment problem.
We do so by pre-optimizing a smooth and convex dual function that has a closed form.
Our strategy leads to two practical algorithms in model-based and preference-based settings.
arXiv Detail & Related papers (2024-05-29T22:12:52Z) - Noisy-Correspondence Learning for Text-to-Image Person Re-identification [50.07634676709067]
We propose a novel Robust Dual Embedding method (RDE) to learn robust visual-semantic associations even with noisy correspondences.
Our method achieves state-of-the-art results both with and without synthetic noisy correspondences on three datasets.
arXiv Detail & Related papers (2023-08-19T05:34:13Z) - Uncertainty-Aware Source-Free Adaptive Image Super-Resolution with Wavelet Augmentation Transformer [60.31021888394358]
Unsupervised Domain Adaptation (UDA) can effectively address domain gap issues in real-world image Super-Resolution (SR).
We propose a SOurce-free Domain Adaptation framework for image SR (SODA-SR) to address this issue, i.e., adapt a source-trained model to a target domain with only unlabeled target data.
arXiv Detail & Related papers (2023-03-31T03:14:44Z) - A Generalized Kernel Risk Sensitive Loss for Robust Two-Dimensional Singular Value Decomposition [11.234115388848283]
Two-dimensional singular value decomposition (2DSVD) has been widely used for image processing tasks, such as image reconstruction, classification, and clustering.
Traditional 2DSVD is based on the mean square error (MSE) loss, which is sensitive to outliers.
We propose a robust 2DSVD based on a generalized kernel risk sensitive loss, designed to be robust to noise and outliers.
arXiv Detail & Related papers (2020-05-10T14:02:40Z)