Trustworthy Text-to-Image Diffusion Models: A Timely and Focused Survey
- URL: http://arxiv.org/abs/2409.18214v1
- Date: Thu, 26 Sep 2024 18:46:47 GMT
- Title: Trustworthy Text-to-Image Diffusion Models: A Timely and Focused Survey
- Authors: Yi Zhang, Zhen Chen, Chih-Hong Cheng, Wenjie Ruan, Xiaowei Huang, Dezong Zhao, David Flynn, Siddartha Khastgir, Xingyu Zhao
- Abstract summary: Text-to-Image (T2I) Diffusion Models (DMs) have garnered widespread attention for their impressive advancements in image generation.
Their growing popularity has raised ethical and social concerns related to key non-functional properties of trustworthiness.
- Score: 22.930713650452894
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Text-to-Image (T2I) Diffusion Models (DMs) have garnered widespread attention for their impressive advancements in image generation. However, their growing popularity has raised ethical and social concerns related to key non-functional properties of trustworthiness, such as robustness, fairness, security, privacy, factuality, and explainability, similar to those in traditional deep learning (DL) tasks. Conventional approaches for studying trustworthiness in DL tasks often fall short due to the unique characteristics of T2I DMs, e.g., their multi-modal nature. Given this challenge, recent efforts have been made to develop new methods for investigating trustworthiness in T2I DMs via various means, including falsification, enhancement, verification & validation, and assessment. However, there is a notable lack of in-depth analysis of those non-functional properties and means. In this survey, we provide a timely and focused review of the literature on trustworthy T2I DMs, covering a concisely structured taxonomy from the perspectives of properties, means, benchmarks, and applications. Our review begins with essential preliminaries of T2I DMs; we then summarise key definitions/metrics specific to T2I tasks and analyse the means proposed in recent literature based on these definitions/metrics. Additionally, we review benchmarks and domain applications of T2I DMs. Finally, we highlight the gaps in current research, discuss the limitations of existing methods, and propose future research directions to advance the development of trustworthy T2I DMs. We also track the latest developments in this field and maintain an up-to-date GitHub repository at: https://github.com/wellzline/Trustworthy_T2I_DMs
Related papers
- MinorityPrompt: Text to Minority Image Generation via Prompt Optimization [57.319845580050924]
We investigate the generation of minority samples using pretrained text-to-image (T2I) latent diffusion models.
We develop an online prompt optimization framework that can encourage the emergence of desired properties.
We then tailor this generic prompt into a specialized solver that promotes the generation of minority features.
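The online optimization can be pictured as gradient ascent on a few learnable prompt tokens. Below is a minimal sketch of that idea; `minority_score`, the tensor shapes, and the learning rate are illustrative stand-ins, not the authors' implementation (which scores intermediate diffusion states).
```python
import torch

# Stand-in for a differentiable "minority" objective; the real method scores
# intermediate diffusion latents, which we abstract away here.
minority_score = torch.nn.Sequential(torch.nn.Linear(768, 1))

# Learnable tokens appended to a frozen prompt embedding (hypothetical shapes).
prompt_emb = torch.randn(4, 768)                  # frozen text-encoder output
learned_tokens = torch.zeros(2, 768, requires_grad=True)
opt = torch.optim.Adam([learned_tokens], lr=1e-2)

for step in range(100):
    full_prompt = torch.cat([prompt_emb, learned_tokens], dim=0)
    # Encourage the emergence of minority features by maximizing the score.
    loss = -minority_score(full_prompt).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```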
arXiv Detail & Related papers (2024-10-10T11:56:09Z) - FAIntbench: A Holistic and Precise Benchmark for Bias Evaluation in Text-to-Image Models [7.30796695035169]
FAIntbench is a holistic and precise benchmark for biases in Text-to-Image (T2I) models.
We applied FAIntbench to evaluate seven recent large-scale T2I models and conducted human evaluation.
Results demonstrated the effectiveness of FAIntbench in identifying various biases.
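A bias benchmark of this kind boils down to generating many images per neutral prompt, classifying a protected attribute, and measuring how far the group frequencies drift from a reference distribution. The sketch below assumes a placeholder classifier and a uniform reference; FAIntbench's actual protocol, attributes, and metrics may differ.
```python
import random
from collections import Counter

def classify_attribute(image) -> str:
    """Placeholder for an attribute classifier; FAIntbench's
    actual classifiers and attribute definitions may differ."""
    return random.choice(["A", "B"])

def bias_score(images, groups=("A", "B")) -> float:
    """Max deviation of observed group frequency from a uniform reference."""
    counts = Counter(classify_attribute(img) for img in images)
    n = len(images)
    return max(abs(counts[g] / n - 1 / len(groups)) for g in groups)

print(bias_score([object()] * 100))  # 0.0 means perfectly balanced
```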
arXiv Detail & Related papers (2024-05-28T04:18:00Z) - Who Evaluates the Evaluations? Objectively Scoring Text-to-Image Prompt Coherence Metrics with T2IScoreScore (TS2) [62.44395685571094]
We introduce T2IScoreScore, a curated set of semantic error graphs containing a prompt and a set of increasingly erroneous images.
These allow us to rigorously judge whether a given prompt faithfulness metric can correctly order images with respect to their objective error count.
We find that the state-of-the-art VLM-based metrics fail to significantly outperform simple (and supposedly worse) feature-based metrics like CLIPScore.
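The test amounts to a rank-agreement check: a trustworthy faithfulness metric should order the images of an error graph by their known error counts. A minimal sketch with made-up scores (not the TS2 implementation):
```python
from scipy.stats import spearmanr

# One "semantic error graph": images ordered by objective error count.
error_counts  = [0, 1, 2, 3, 4]                 # ground-truth errors per image
metric_scores = [0.91, 0.85, 0.80, 0.82, 0.60]  # hypothetical metric outputs

# A good metric should decrease as errors increase (strong negative rank corr.).
rho, p = spearmanr(error_counts, metric_scores)
print(f"Spearman rho = {rho:.2f} (p = {p:.3f})")
```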
arXiv Detail & Related papers (2024-04-05T17:57:16Z) - ProTIP: Probabilistic Robustness Verification on Text-to-Image Diffusion Models against Stochastic Perturbation [18.103478658038846]
Text-to-Image (T2I) Diffusion Models (DMs) have shown impressive abilities in generating high-quality images based on simple text descriptions.
As is common with many Deep Learning (DL) models, DMs are subject to a lack of robustness.
We introduce a probabilistic notion of T2I DMs' robustness and then establish an efficient framework, ProTIP, to evaluate it with statistical guarantees.
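The flavor of such a statistical guarantee can be conveyed with plain Monte Carlo estimation plus a Hoeffding sample-size bound; this is a simplified sketch, not ProTIP's actual (more efficient, sequential) procedure, and every stand-in component is hypothetical.
```python
import math
import random

def estimate_robustness(model, prompt, perturb, is_consistent,
                        eps=0.05, delta=0.01):
    """Monte Carlo estimate of P(output unchanged under stochastic
    perturbation). The Hoeffding sample size makes the estimate accurate
    to within eps with probability >= 1 - delta; simplified, not ProTIP."""
    n = math.ceil(math.log(2 / delta) / (2 * eps ** 2))
    baseline = model(prompt)
    successes = 0
    for _ in range(n):
        successes += is_consistent(baseline, model(perturb(prompt)))
    return successes / n

# Toy usage with stand-in components:
est = estimate_robustness(
    model=lambda p: len(p) % 2,
    prompt="a red apple on a table",
    perturb=lambda p: p + random.choice(["", " "]),
    is_consistent=lambda a, b: a == b,
)
print(f"estimated robustness: {est:.3f}")
```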
arXiv Detail & Related papers (2024-02-23T16:48:56Z) - Memory in Plain Sight: Surveying the Uncanny Resemblances of Associative Memories and Diffusion Models [65.08133391009838]
The generative process of Diffusion Models (DMs) has recently set the state of the art on many AI generation benchmarks.
We introduce a novel perspective that describes DMs using the mathematical language of memory retrieval from the field of energy-based Associative Memories (AMs).
We present a growing body of evidence that DMs exhibit the empirical behavior we would expect from AMs, and conclude by discussing research opportunities revealed by understanding DMs as a form of energy-based memory.
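The correspondence rests on the score-based view of DMs: the learned score approximates the gradient of a log-density, which defines an energy whose noisy gradient descent mirrors AM retrieval dynamics. Schematically (a standard identification, not the paper's exact notation):
```latex
% The learned score approximates the gradient of a log-density, which
% defines an energy E(x) up to an additive constant:
s_\theta(x) \;\approx\; \nabla_x \log p(x) \;=\; -\nabla_x E(x)
% A denoising/sampling step is then noisy gradient descent on E, i.e.,
% descent toward a stored minimum -- the retrieval dynamics of an AM:
x_{t-1} \;=\; x_t - \eta\,\nabla_x E(x_t) + \sqrt{2\eta}\,\epsilon_t,
\qquad \epsilon_t \sim \mathcal{N}(0, I)
```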
arXiv Detail & Related papers (2023-09-28T17:57:09Z) - Evaluating the Robustness of Text-to-image Diffusion Models against Real-world Attacks [22.651626059348356]
Text-to-image (T2I) diffusion models (DMs) have shown promise in generating high-quality images from textual descriptions.
One fundamental question is whether existing T2I DMs are robust against variations over input texts.
This work provides the first robustness evaluation of T2I DMs against real-world attacks.
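Real-world attacks here mean perturbations a user could plausibly produce, such as typos and look-alike glyphs, rather than gradient-based adversarial noise. A toy sketch of such perturbations (the paper's attack set and evaluation pipeline are richer):
```python
import random

def typo(text: str) -> str:
    """Drop one character -- a plausible typing slip."""
    i = random.randrange(len(text))
    return text[:i] + text[i + 1:]

def glyph(text: str) -> str:
    """Swap characters for visually similar ones (homoglyphs)."""
    table = {"o": "0", "l": "1", "a": "@"}
    return "".join(table.get(c, c) for c in text)

prompt = "a photo of a yellow school bus"
variants = [typo(prompt), glyph(prompt)]
# Robustness check (conceptual): feed each variant to the T2I model and
# compare outputs with the clean generation, e.g., via CLIP similarity.
print(variants)
```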
arXiv Detail & Related papers (2023-06-16T00:43:35Z) - Measuring the Robustness of NLP Models to Domain Shifts [50.89876374569385]
Existing research on Domain Robustness (DR) suffers from disparate setups, limited task variety, and scarce research on recent capabilities such as in-context learning.
Current research focuses on challenge sets and relies solely on the Source Drop (SD), which uses the source in-domain performance as the reference point for degradation.
We argue that the Target Drop (TD), which measures degradation from the target in-domain performance, should be used as a complementary point of view.
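One natural formalization of the two reference points (the paper's exact normalization may differ) is relative degradation measured against source in-domain versus target in-domain performance:
```python
def source_drop(src_in: float, tgt_out: float) -> float:
    """Degradation relative to source in-domain performance."""
    return (src_in - tgt_out) / src_in

def target_drop(tgt_in: float, tgt_out: float) -> float:
    """Degradation relative to target in-domain performance."""
    return (tgt_in - tgt_out) / tgt_in

# Toy numbers: a model scoring 0.90 in-domain looks bad cross-domain by SD,
# but if the target domain is simply harder (0.75 in-domain), TD is milder.
print(source_drop(0.90, 0.70))  # ~0.222
print(target_drop(0.75, 0.70))  # ~0.067
```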
arXiv Detail & Related papers (2023-05-31T20:25:08Z) - Artificial Intelligence-Based Methods for Precision Medicine: Diabetes Risk Prediction [0.3425341633647624]
This scoping review analyzes the existing literature on AI-based models for type 2 diabetes mellitus (T2DM) risk prediction.
Traditional machine learning models were more prevalent than deep learning models.
Both unimodal and multimodal models showed promising performance, with the latter outperforming the former.
arXiv Detail & Related papers (2023-05-24T14:45:54Z) - What Makes Data-to-Text Generation Hard for Pretrained Language Models? [17.07349898176898]
Expressing natural language descriptions of structured facts or relations -- data-to-text generation (D2T) -- increases the accessibility of structured knowledge repositories.
Previous work shows that pre-trained language models (PLMs) perform remarkably well on this task after fine-tuning on a significant amount of task-specific training data.
We conduct an empirical study of both fine-tuned and auto-regressive PLMs on the DART multi-domain D2T dataset.
arXiv Detail & Related papers (2022-05-23T17:58:39Z) - Adversarial Robustness under Long-Tailed Distribution [93.50792075460336]
Adversarial robustness has recently attracted extensive study, revealing the vulnerability and intrinsic characteristics of deep networks.
In this work we investigate the adversarial vulnerability as well as defense under long-tailed distributions.
We propose a clean yet effective framework, RoBal, which consists of two dedicated modules: a scale-invariant classifier and data re-balancing.
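The scale-invariant module is commonly realized as a cosine classifier, whose logits ignore feature and weight norms. A minimal PyTorch sketch of that idea (simplified, not the authors' code):
```python
import torch
import torch.nn.functional as F

class CosineClassifier(torch.nn.Module):
    """Scale-invariant head: logits depend only on the angle between
    features and class weights, not their norms (a common realization
    of a scale-invariant module; simplified sketch)."""
    def __init__(self, dim: int, n_classes: int, tau: float = 16.0):
        super().__init__()
        self.weight = torch.nn.Parameter(torch.randn(n_classes, dim))
        self.tau = tau  # temperature restores a usable logit scale

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.tau * F.normalize(x, dim=-1) @ F.normalize(self.weight, dim=-1).T

logits = CosineClassifier(dim=128, n_classes=10)(torch.randn(4, 128))
print(logits.shape)  # torch.Size([4, 10])
```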
arXiv Detail & Related papers (2021-04-06T17:53:08Z) - SupMMD: A Sentence Importance Model for Extractive Summarization using Maximum Mean Discrepancy [92.5683788430012]
SupMMD is a novel technique for generic and update summarization based on the maximum mean discrepancy (MMD) from kernel two-sample testing.
We show the efficacy of SupMMD in both generic and update summarization tasks by meeting or exceeding the current state of the art on the DUC-2004 and TAC-2009 datasets.
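Kernel two-sample MMD itself is compact to state: compare average within-sample and cross-sample kernel similarities. The sketch below computes a plain (biased, V-statistic) squared MMD between embedding sets; SupMMD's supervised extension is not shown, and all inputs are made-up stand-ins.
```python
import torch

def mmd_rbf(X: torch.Tensor, Y: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
    """Biased (V-statistic) estimate of squared MMD with an RBF kernel;
    a plain two-sample statistic, not SupMMD's supervised variant."""
    def k(A, B):
        d = torch.cdist(A, B) ** 2
        return torch.exp(-d / (2 * sigma ** 2))
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

# Toy: sentence embeddings of a candidate summary vs. the source document.
summary_emb = torch.randn(5, 64)
source_emb = torch.randn(200, 64)
print(mmd_rbf(summary_emb, source_emb).item())
```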
arXiv Detail & Related papers (2020-10-06T09:26:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.