PromptLA: Towards Integrity Verification of Black-box Text-to-Image Diffusion Models
- URL: http://arxiv.org/abs/2412.16257v2
- Date: Fri, 28 Mar 2025 08:39:46 GMT
- Title: PromptLA: Towards Integrity Verification of Black-box Text-to-Image Diffusion Models
- Authors: Zhuomeng Zhang, Fangqi Li, Chong Di, Hongyu Zhu, Hanyi Wang, Shilin Wang,
- Abstract summary: Malicious actors can fine-tune text-to-image (T2I) diffusion models to generate illegal content.<n>We propose a novel prompt selection algorithm based on learning automaton (PromptLA) for efficient and accurate verification.
- Score: 17.12906933388337
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the impressive synthesis quality of text-to-image (T2I) diffusion models, their black-box deployment poses significant regulatory challenges: Malicious actors can fine-tune these models to generate illegal content, circumventing existing safeguards through parameter manipulation. Therefore, it is essential to verify the integrity of T2I diffusion models. To this end, considering the randomness within the outputs of generative models and the high costs in interacting with them, we discern model tampering via the KL divergence between the distributions of the features of generated images. We propose a novel prompt selection algorithm based on learning automaton (PromptLA) for efficient and accurate verification. Evaluations on four advanced T2I models (e.g., SDXL, FLUX.1) demonstrate that our method achieves a mean AUC of over 0.96 in integrity detection, exceeding baselines by more than 0.2, showcasing strong effectiveness and generalization. Additionally, our approach achieves lower cost and is robust against image-level post-processing. To the best of our knowledge, this paper is the first work addressing the integrity verification of T2I diffusion models, which establishes quantifiable standards for AI copyright litigation in practice.
Related papers
- D2C: Unlocking the Potential of Continuous Autoregressive Image Generation with Discrete Tokens [80.75893450536577]
We propose D2C, a novel two-stage method to enhance model generation capacity.
In the first stage, the discrete-valued tokens representing coarse-grained image features are sampled by employing a small discrete-valued generator.
In the second stage, the continuous-valued tokens representing fine-grained image features are learned conditioned on the discrete token sequence.
arXiv Detail & Related papers (2025-03-21T13:58:49Z) - AEIOU: A Unified Defense Framework against NSFW Prompts in Text-to-Image Models [39.11841245506388]
Malicious users often exploit text-to-image (T2I) models to generate Not-Safe-for-Work (NSFW) images.
We introduce AEIOU, a framework that is Adaptable, Efficient, Interpretable, Optimizable, and Unified against NSFW prompts in T2I models.
arXiv Detail & Related papers (2024-12-24T03:17:45Z) - SleeperMark: Towards Robust Watermark against Fine-Tuning Text-to-image Diffusion Models [77.80595722480074]
SleeperMark is a novel framework designed to embed resilient watermarks into T2I diffusion models.<n>It guides the model to disentangle the watermark information from the semantic concepts it learns, allowing the model to retain the embedded watermark.<n>Our experiments demonstrate the effectiveness of SleeperMark across various types of diffusion models.
arXiv Detail & Related papers (2024-12-06T08:44:18Z) - Towards Understanding and Quantifying Uncertainty for Text-to-Image Generation [4.1364578693016325]
Uncertainty quantification in text-to-image (T2I) generative models is crucial for understanding model behavior and improving output reliability.
We are the first to quantify and evaluate the uncertainty of T2I models with respect to the prompt.
We introduce Prompt-based UNCertainty Estimation for T2I models (PUNC) to better address uncertainties arising from the semantics of the prompt and generated images.
arXiv Detail & Related papers (2024-12-04T10:03:52Z) - Model Integrity when Unlearning with T2I Diffusion Models [11.321968363411145]
We propose approximate Machine Unlearning algorithms to reduce the generation of specific types of images, characterized by samples from a forget distribution''
We then propose unlearning algorithms that demonstrate superior effectiveness in preserving model integrity compared to existing baselines.
arXiv Detail & Related papers (2024-11-04T13:15:28Z) - Improving Consistency in Diffusion Models for Image Super-Resolution [28.945663118445037]
We observe two kinds of inconsistencies in diffusion-based methods.
We introduce ConsisSR to handle both semantic and training-inference consistencies.
Our method demonstrates state-of-the-art performance among existing diffusion models.
arXiv Detail & Related papers (2024-10-17T17:41:52Z) - Towards Effective User Attribution for Latent Diffusion Models via Watermark-Informed Blending [54.26862913139299]
We introduce a novel framework Towards Effective user Attribution for latent diffusion models via Watermark-Informed Blending (TEAWIB)<n> TEAWIB incorporates a unique ready-to-use configuration approach that allows seamless integration of user-specific watermarks into generative models.<n>Experiments validate the effectiveness of TEAWIB, showcasing the state-of-the-art performance in perceptual quality and attribution accuracy.
arXiv Detail & Related papers (2024-09-17T07:52:09Z) - Elucidating Optimal Reward-Diversity Tradeoffs in Text-to-Image Diffusion Models [20.70550870149442]
We introduce Annealed Importance Guidance (AIG), an inference-time regularization inspired by Annealed Importance Sampling.
Our experiments demonstrate the benefits of AIG for Stable Diffusion models, striking the optimal balance between reward optimization and image diversity.
arXiv Detail & Related papers (2024-09-09T16:27:26Z) - Not Every Image is Worth a Thousand Words: Quantifying Originality in Stable Diffusion [21.252145402613472]
This work addresses the challenge of quantifying originality in text-to-image (T2I) generative diffusion models.
We propose a method that leverages textual inversion to measure the originality of an image based on the number of tokens required for its reconstruction by the model.
arXiv Detail & Related papers (2024-08-15T14:42:02Z) - Improving Text-to-Image Consistency via Automatic Prompt Optimization [26.2587505265501]
We introduce a T2I optimization-by-prompting framework, OPT2I, to improve prompt-image consistency in T2I models.
Our framework starts from a user prompt and iteratively generates revised prompts with the goal of maximizing a consistency score.
arXiv Detail & Related papers (2024-03-26T15:42:01Z) - Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation [59.184980778643464]
Fine-tuning Diffusion Models remains an underexplored frontier in generative artificial intelligence (GenAI)
In this paper, we introduce an innovative technique called self-play fine-tuning for diffusion models (SPIN-Diffusion)
Our approach offers an alternative to conventional supervised fine-tuning and RL strategies, significantly improving both model performance and alignment.
arXiv Detail & Related papers (2024-02-15T18:59:18Z) - Harnessing Diffusion Models for Visual Perception with Meta Prompts [68.78938846041767]
We propose a simple yet effective scheme to harness a diffusion model for visual perception tasks.
We introduce learnable embeddings (meta prompts) to the pre-trained diffusion models to extract proper features for perception.
Our approach achieves new performance records in depth estimation tasks on NYU depth V2 and KITTI, and in semantic segmentation task on CityScapes.
arXiv Detail & Related papers (2023-12-22T14:40:55Z) - Separate-and-Enhance: Compositional Finetuning for Text2Image Diffusion
Models [58.46926334842161]
This work illuminates the fundamental reasons for such misalignment, pinpointing issues related to low attention activation scores and mask overlaps.
We propose two novel objectives, the Separate loss and the Enhance loss, that reduce object mask overlaps and maximize attention scores.
Our method diverges from conventional test-time-adaptation techniques, focusing on finetuning critical parameters, which enhances scalability and generalizability.
arXiv Detail & Related papers (2023-12-10T22:07:42Z) - R&B: Region and Boundary Aware Zero-shot Grounded Text-to-image
Generation [74.5598315066249]
We probe into zero-shot grounded T2I generation with diffusion models.
We propose a Region and Boundary (R&B) aware cross-attention guidance approach.
arXiv Detail & Related papers (2023-10-13T05:48:42Z) - WOUAF: Weight Modulation for User Attribution and Fingerprinting in Text-to-Image Diffusion Models [32.29120988096214]
This paper introduces a novel approach to model fingerprinting that assigns responsibility for the generated images.
Our method modifies generative models based on each user's unique digital fingerprint, imprinting a unique identifier onto the resultant content that can be traced back to the user.
arXiv Detail & Related papers (2023-06-07T19:44:14Z) - SinDiffusion: Learning a Diffusion Model from a Single Natural Image [159.4285444680301]
We present SinDiffusion, leveraging denoising diffusion models to capture internal distribution of patches from a single natural image.
It is based on two core designs. First, SinDiffusion is trained with a single model at a single scale instead of multiple models with progressive growing of scales.
Second, we identify that a patch-level receptive field of the diffusion network is crucial and effective for capturing the image's patch statistics.
arXiv Detail & Related papers (2022-11-22T18:00:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.