Related papers: T-HITL Effectively Addresses Problematic Associations in Image Generation and Maintains Overall Visual Quality

URL: http://arxiv.org/abs/2402.17101v1
Date: Tue, 27 Feb 2024 00:29:33 GMT
Title: T-HITL Effectively Addresses Problematic Associations in Image Generation and Maintains Overall Visual Quality
Authors: Susan Epstein, Li Chen, Alessandro Vecchiato, Ankit Jain
Abstract summary: We focus on addressing the generation of problematic associations between demographic groups and semantic concepts. We propose a new methodology with twice-human-in-the-loop (T-HITL) that promises improvements in both reducing problematic associations and also maintaining visual quality.
Score: 52.5529784801908
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Generative AI image models may inadvertently generate problematic representations of people. Past research has noted that millions of users engage daily across the world with these models and that the models, including through problematic representations of people, have the potential to compound and accelerate real-world discrimination and other harms (Bianchi et al., 2023). In this paper, we focus on addressing the generation of problematic associations between demographic groups and semantic concepts that may reflect and reinforce negative narratives embedded in social data. Building on sociological literature (Blumer, 1958) and mapping representations to model behaviors, we have developed a taxonomy to study problematic associations in image generation models. We explore the effectiveness of fine-tuning at the model level as a method to address these associations, identifying a potential reduction in visual quality as a limitation of traditional fine-tuning. We also propose a new methodology with twice-human-in-the-loop (T-HITL) that promises improvements in both reducing problematic associations and also maintaining visual quality. We demonstrate the effectiveness of T-HITL by providing evidence of three problematic associations addressed by T-HITL at the model level. Our contributions to scholarship are two-fold. By defining problematic associations in the context of machine learning models and generative AI, we introduce a conceptual and technical taxonomy for addressing some of these associations. Finally, we provide a method, T-HITL, that addresses these associations and simultaneously maintains visual quality of image model generations. This mitigation need not be a tradeoff, but rather an enhancement.

Related papers

Multi-Group Proportional Representation for Text-to-Image Models [19.36512604668349]
Text-to-image (T2I) generative models can create vivid, realistic images from textual descriptions. As these models proliferate, they expose new concerns about their ability to represent diverse demographic groups, propagate stereotypes, and efface minority populations. This paper introduces a novel framework to measure the representation of intersectional groups in images generated by T2I models by applying the Multi-Group Proportional Representation metric.
arXiv Detail & Related papers (2025-05-29T21:48:28Z)
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities [22.476740954286836]
We present a comprehensive survey aimed at guiding future research. We review existing unified models, categorizing them into three main architectural paradigms. We discuss the key challenges facing this nascent field, including tokenization strategy, cross-modal attention, and data.
arXiv Detail & Related papers (2025-05-05T11:18:03Z)
Boosting Alignment for Post-Unlearning Text-to-Image Generative Models [55.82190434534429]
Large-scale generative models have shown impressive image-generation capabilities, propelled by massive data. This often inadvertently leads to the generation of harmful or inappropriate content and raises copyright concerns. We propose a framework that seeks an optimal model update at each unlearning iteration, ensuring monotonic improvement on both objectives.
arXiv Detail & Related papers (2024-12-09T21:36:10Z)
Interactive Visual Assessment for Text-to-Image Generation Models [28.526897072724662]
We propose DyEval, a dynamic interactive visual assessment framework for generative models. DyEval features an intuitive visual interface that enables users to interactively explore and analyze model behaviors. Our framework provides valuable insights for improving generative models and has broad implications for advancing the reliability and capabilities of visual generation systems.
arXiv Detail & Related papers (2024-11-23T10:06:18Z)
Autoregressive Models in Vision: A Survey [119.23742136065307]
This survey comprehensively examines the literature on autoregressive models applied to vision. We divide visual autoregressive models into three general sub-categories, including pixel-based, token-based, and scale-based models. We present a multi-faceted categorization of autoregressive models in computer vision, including image generation, video generation, 3D generation, and multi-modal generation.
arXiv Detail & Related papers (2024-11-08T17:15:12Z)
Understanding the Limits of Vision Language Models Through the Lens of the Binding Problem [37.27516441519387]
We show that state-of-the-art vision language models exhibit surprising failures on basic multi-object reasoning tasks that humans perform with near perfect accuracy. We find that many of the puzzling failures of state-of-the-art VLMs can be explained as arising due to the binding problem, and that these failure modes are strikingly similar to the limitations exhibited by rapid, feedforward processing in the human brain.
arXiv Detail & Related papers (2024-10-31T22:24:47Z)
A Survey on All-in-One Image Restoration: Taxonomy, Evaluation and Future Trends [67.43992456058541]
Image restoration (IR) refers to the process of improving visual quality of images while removing degradation, such as noise, blur, weather effects, and so on. Traditional IR methods typically target specific types of degradation, which limits their effectiveness in real-world scenarios with complex distortions. The all-in-one image restoration (AiOIR) paradigm has emerged, offering a unified framework that adeptly addresses multiple degradation types.
arXiv Detail & Related papers (2024-10-19T11:11:09Z)
Exploring Social Media Image Categorization Using Large Models with Different Adaptation Methods: A Case Study on Cultural Nature's Contributions to People [1.7736307382785161]
Social media images provide valuable insights for modeling, mapping, and understanding human interactions with natural and cultural heritage. Categorizing these images into semantically meaningful groups remains highly complex due to the vast diversity and heterogeneity of their visual content. We introduce FLIPS, a dataset of Flickr images that capture the interaction between humans and nature. We evaluate various solutions based on different types and combinations of large models using various adaptation methods.
arXiv Detail & Related papers (2024-09-30T23:04:55Z)
Information Theoretic Text-to-Image Alignment [49.396917351264655]
We present a novel method that relies on an information-theoretic alignment measure to steer image generation. Our method is on-par or superior to the state-of-the-art, yet requires nothing but a pre-trained denoising network to estimate MI.
arXiv Detail & Related papers (2024-05-31T12:20:02Z)
Situating the social issues of image generation models in the model life cycle: a sociotechnical approach [20.99805435959377]
This paper reports on a novel, comprehensive categorization of the social issues associated with image generation models. We identify seven issue clusters arising from image generation models: data issues, intellectual property, bias, privacy, and the impacts on the informational, cultural, and natural environments. We argue that the risks posed by image generation models are comparable in severity to the risks posed by large language models.
arXiv Detail & Related papers (2023-11-30T08:32:32Z)
Foundational Models Defining a New Era in Vision: A Survey and Outlook [151.49434496615427]
Vision systems to see and reason about the compositional nature of visual scenes are fundamental to understanding our world. The models learned to bridge the gap between such modalities coupled with large-scale training data facilitate contextual reasoning, generalization, and prompt capabilities at test time. The output of such models can be modified through human-provided prompts without retraining, e.g., segmenting a particular object by providing a bounding box, having interactive dialogues by asking questions about an image or video scene or manipulating the robot's behavior through language instructions.
arXiv Detail & Related papers (2023-07-25T17:59:18Z)
Sample-Efficient Learning of Novel Visual Concepts [7.398195748292981]
State-of-the-art deep learning models struggle to recognize novel objects in a few-shot setting. We show that incorporating a symbolic knowledge graph into a state-of-the-art recognition model enables a new approach for effective few-shot classification.
arXiv Detail & Related papers (2023-06-15T20:24:30Z)
Causal Reasoning Meets Visual Representation Learning: A Prospective Study [117.08431221482638]
Lack of interpretability, robustness, and out-of-distribution generalization are becoming the challenges of the existing visual models. Inspired by the strong inference ability of human-level agents, recent years have witnessed great effort in developing causal reasoning paradigms. This paper aims to provide a comprehensive overview of this emerging field, attract attention, encourage discussions, bring to the forefront the urgency of developing novel causal reasoning methods.
arXiv Detail & Related papers (2022-04-26T02:22:28Z)
On the Opportunities and Risks of Foundation Models [256.61956234436553]
We call these models foundation models to underscore their critically central yet incomplete character. This report provides a thorough account of the opportunities and risks of foundation models. To tackle these questions, we believe much of the critical research on foundation models will require deep interdisciplinary collaboration.
arXiv Detail & Related papers (2021-08-16T17:50:08Z)

This list is automatically generated from the titles and abstracts of the papers on this site.