Civiverse: A Dataset for Analyzing User Engagement with Open-Source Text-to-Image Models
- URL: http://arxiv.org/abs/2408.15261v1
- Date: Sat, 10 Aug 2024 21:41:03 GMT
- Title: Civiverse: A Dataset for Analyzing User Engagement with Open-Source Text-to-Image Models
- Authors: Maria-Teresa De Rosa Palmini, Laura Wagner, Eva Cetinic
- Abstract summary: We analyze the Civiverse prompt dataset, encompassing millions of images and related metadata.
We focus on prompt analysis, specifically examining the semantic characteristics of text prompts.
Our findings reveal a predominant preference for generating explicit content, along with a focus on homogenization of semantic content.
- Score: 0.7209758868768352
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Text-to-image (TTI) systems, particularly those utilizing open-source frameworks, have become increasingly prevalent in the production of Artificial Intelligence (AI)-generated visuals. While existing literature has explored various problematic aspects of TTI technologies, such as bias in generated content, intellectual property concerns, and the reinforcement of harmful stereotypes, open-source TTI frameworks have not yet been systematically examined from a cultural perspective. This study addresses this gap by analyzing the CivitAI platform, a leading open-source platform dedicated to TTI AI. We introduce the Civiverse prompt dataset, encompassing millions of images and related metadata. We focus on prompt analysis, specifically examining the semantic characteristics of text prompts, as it is crucial for addressing societal issues related to generative technologies. This analysis provides insights into user intentions, preferences, and behaviors, which in turn shape the outputs of these models. Our findings reveal a predominant preference for generating explicit content, along with a focus on homogenization of semantic content. These insights underscore the need for further research into the perpetuation of misogyny, harmful stereotypes, and the uniformity of visual culture within these models.
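To make the prompt analysis described above concrete, here is a minimal illustrative sketch, not the authors' actual pipeline: it assumes the Civiverse prompts have already been extracted into a plain list of strings (the dataset's real schema is not specified here) and computes term frequencies plus a crude semantic-homogenization proxy, the mean pairwise cosine similarity of TF-IDF prompt vectors.

```python
# Illustrative sketch only: the real Civiverse schema and the paper's
# analysis pipeline are not described here; prompts are assumed to be
# plain strings already extracted from the dataset's metadata.
from collections import Counter

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

prompts = [
    "masterpiece, portrait of a woman, photorealistic, 8k",
    "portrait of a woman, cinematic lighting, ultra detailed",
    "a watercolor landscape, soft colors, impressionist style",
]  # placeholder examples; in practice, load prompts from the metadata

# 1. Term-frequency view of what users ask for most often.
token_counts = Counter(
    token for p in prompts for token in p.lower().replace(",", " ").split()
)
print(token_counts.most_common(5))

# 2. Crude homogenization proxy: mean pairwise cosine similarity of
# TF-IDF vectors. Values near 1 suggest highly uniform prompt content.
tfidf = TfidfVectorizer().fit_transform(prompts)
sim = cosine_similarity(tfidf)
n = len(prompts)
mean_pairwise = (sim.sum() - n) / (n * n - n)  # exclude the diagonal
print(f"mean pairwise prompt similarity: {mean_pairwise:.3f}")
```

A higher mean similarity across a large prompt sample would be one rough quantitative signal of the kind of semantic homogenization the study reports.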
Related papers
- VISTA: A Visual and Textual Attention Dataset for Interpreting Multimodal Models [2.0718016474717196]
Integrated Vision and Language Models (VLMs) are frequently regarded as black boxes within the machine learning research community.
We present an image-text aligned human visual attention dataset that maps specific associations between image regions and corresponding text segments.
We then compare the internal heatmaps generated by VL models with this dataset, allowing us to analyze and better understand the model's decision-making process (a sketch of such a comparison appears after this list).
arXiv Detail & Related papers (2024-10-06T20:11:53Z)
- See then Tell: Enhancing Key Information Extraction with Vision Grounding [54.061203106565706]
We introduce STNet (See then Tell Net), a novel end-to-end model designed to deliver precise answers with relevant vision grounding.
To enhance the model's seeing capabilities, we collect extensive structured table recognition datasets.
arXiv Detail & Related papers (2024-09-29T06:21:05Z)
- Knowledge-Aware Reasoning over Multimodal Semi-structured Tables [85.24395216111462]
This study investigates whether current AI models can perform knowledge-aware reasoning on multimodal structured data.
We introduce MMTabQA, a new dataset designed for this purpose.
Our experiments highlight substantial challenges for current AI models in effectively integrating and interpreting multiple text and image inputs.
arXiv Detail & Related papers (2024-08-25T15:17:43Z)
- A Survey on Personalized Content Synthesis with Diffusion Models [57.01364199734464]
Personalized content synthesis (PCS) aims to customize the subject of interest to specific user-defined prompts.
Over the past two years, more than 150 methods have been proposed.
This paper offers a comprehensive survey of PCS, with a particular focus on diffusion models.
arXiv Detail & Related papers (2024-05-09T04:36:04Z)
- Explainable artificial intelligence approaches for brain-computer interfaces: a review and design space [6.786321327136925]
This review paper provides an integrated perspective on Explainable Artificial Intelligence (XAI) techniques applied to Brain-Computer Interfaces (BCIs).
Brain-Computer Interfaces use predictive models to interpret brain signals for various high-stakes applications.
The XAI-for-BCI literature, however, lacks such an integrated perspective.
arXiv Detail & Related papers (2023-12-20T13:56:31Z)
- Language Agents for Detecting Implicit Stereotypes in Text-to-image Models at Scale [45.64096601242646]
We introduce a novel agent architecture tailored for stereotype detection in text-to-image models.
We build a stereotype-relevant benchmark based on multiple open-text datasets.
We find that these models often exhibit serious stereotypes for certain prompts about personal characteristics.
arXiv Detail & Related papers (2023-10-18T08:16:29Z)
- Foundational Models Defining a New Era in Vision: A Survey and Outlook [151.49434496615427]
Vision systems that can see and reason about the compositional nature of visual scenes are fundamental to understanding our world.
Models that learn to bridge these modalities, coupled with large-scale training data, facilitate contextual reasoning, generalization, and prompt-based capabilities at test time.
The output of such models can be modified through human-provided prompts without retraining, e.g., segmenting a particular object by providing a bounding box, holding an interactive dialogue by asking questions about an image or video scene, or steering a robot's behavior through language instructions.
arXiv Detail & Related papers (2023-07-25T17:59:18Z)
- Knowledge Graph Augmented Network Towards Multiview Representation Learning for Aspect-based Sentiment Analysis [96.53859361560505]
We propose a knowledge graph augmented network (KGAN) that incorporates external knowledge alongside explicit syntactic and contextual information.
KGAN captures sentiment feature representations from multiple perspectives, i.e., context-, syntax-, and knowledge-based.
Experiments on three popular ABSA benchmarks demonstrate the effectiveness and robustness of our KGAN.
arXiv Detail & Related papers (2022-01-13T08:25:53Z)
- Out of Context: A New Clue for Context Modeling of Aspect-based Sentiment Analysis [54.735400754548635]
ABSA aims to predict the sentiment expressed in a review with respect to a given aspect.
We argue that the given aspect should be treated as a new clue outside the context during the context modeling process.
We design several aspect-aware context encoders based on different backbones.
arXiv Detail & Related papers (2021-06-21T02:26:03Z)
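As noted in the VISTA entry above, here is a minimal sketch of how a model-generated attention heatmap might be compared against a human attention map. It assumes both maps are available as same-shape 2D arrays, which is an assumption about the data format, not a description of the VISTA release.

```python
# Illustrative sketch only: assumes model and human attention maps are
# 2D numpy arrays of the same shape (an assumption, not the VISTA format).
import numpy as np
from scipy.stats import spearmanr


def attention_agreement(model_map: np.ndarray, human_map: np.ndarray) -> float:
    """Spearman rank correlation between two flattened attention maps."""
    assert model_map.shape == human_map.shape
    rho, _ = spearmanr(model_map.ravel(), human_map.ravel())
    return float(rho)


rng = np.random.default_rng(0)
human = rng.random((14, 14))                 # placeholder human fixation map
model = human + 0.1 * rng.random((14, 14))   # placeholder model heatmap
print(f"agreement: {attention_agreement(model, human):.3f}")
```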
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.