T-HITL Effectively Addresses Problematic Associations in Image
Generation and Maintains Overall Visual Quality
- URL: http://arxiv.org/abs/2402.17101v1
- Date: Tue, 27 Feb 2024 00:29:33 GMT
- Title: T-HITL Effectively Addresses Problematic Associations in Image
Generation and Maintains Overall Visual Quality
- Authors: Susan Epstein, Li Chen, Alessandro Vecchiato, Ankit Jain
- Abstract summary: We focus on addressing the generation of problematic associations between demographic groups and semantic concepts.
We propose a new methodology with twice-human-in-the-loop (T-HITL) that promises improvements in both reducing problematic associations and maintaining visual quality.
- Score: 52.5529784801908
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative AI image models may inadvertently generate problematic
representations of people. Past research has noted that millions of users
engage daily across the world with these models and that the models, including
through problematic representations of people, have the potential to compound
and accelerate real-world discrimination and other harms (Bianchi et al., 2023).
In this paper, we focus on addressing the generation of problematic
associations between demographic groups and semantic concepts that may reflect
and reinforce negative narratives embedded in social data. Building on
sociological literature (Blumer, 1958) and mapping representations to model
behaviors, we have developed a taxonomy to study problematic associations in
image generation models. We explore the effectiveness of fine-tuning at the
model level as a method to address these associations, identifying a potential
reduction in visual quality as a limitation of traditional fine-tuning. We also
propose a new methodology with twice-human-in-the-loop (T-HITL) that promises
improvements in both reducing problematic associations and maintaining
visual quality. We demonstrate the effectiveness of T-HITL by providing
evidence of three problematic associations addressed by T-HITL at the model
level. Our contributions to scholarship are two-fold. First, by defining problematic
associations in the context of machine learning models and generative AI, we
introduce a conceptual and technical taxonomy for addressing some of these
associations. Second, we provide a method, T-HITL, that addresses these
associations and simultaneously maintains the visual quality of image model
generations. This mitigation need not be a tradeoff; it can instead be an enhancement.
Related papers
- Autoregressive Models in Vision: A Survey [119.23742136065307]
This survey comprehensively examines the literature on autoregressive models applied to vision.
We divide visual autoregressive models into three general sub-categories, including pixel-based, token-based, and scale-based models.
We present a multi-faceted categorization of autoregressive models in computer vision, including image generation, video generation, 3D generation, and multi-modal generation.
arXiv Detail & Related papers (2024-11-08T17:15:12Z)
- Understanding the Limits of Vision Language Models Through the Lens of the Binding Problem [37.27516441519387]
We show that state-of-the-art vision language models exhibit surprising failures on basic multi-object reasoning tasks that humans perform with near perfect accuracy.
We find that many of the puzzling failures of state-of-the-art VLMs can be explained as arising due to the binding problem, and that these failure modes are strikingly similar to the limitations exhibited by rapid, feedforward processing in the human brain.
arXiv Detail & Related papers (2024-10-31T22:24:47Z)
- A Survey on All-in-One Image Restoration: Taxonomy, Evaluation and Future Trends [67.43992456058541]
Image restoration (IR) refers to the process of improving visual quality of images while removing degradation, such as noise, blur, weather effects, and so on.
Traditional IR methods typically target specific types of degradation, which limits their effectiveness in real-world scenarios with complex distortions.
The all-in-one image restoration (AiOIR) paradigm has emerged, offering a unified framework that adeptly addresses multiple degradation types.
arXiv Detail & Related papers (2024-10-19T11:11:09Z)
- Information Theoretic Text-to-Image Alignment [49.396917351264655]
We present a novel method that relies on an information-theoretic alignment measure to steer image generation.
Our method is on par with or superior to the state of the art, yet requires nothing but a pre-trained denoising network to estimate mutual information (MI).
arXiv Detail & Related papers (2024-05-31T12:20:02Z)
- Situating the social issues of image generation models in the model life cycle: a sociotechnical approach [20.99805435959377]
This paper reports on a novel, comprehensive categorization of the social issues associated with image generation models.
We identify seven issue clusters arising from image generation models: data issues, intellectual property, bias, privacy, and the impacts on the informational, cultural, and natural environments.
We argue that the risks posed by image generation models are comparable in severity to the risks posed by large language models.
arXiv Detail & Related papers (2023-11-30T08:32:32Z)
- Foundational Models Defining a New Era in Vision: A Survey and Outlook [151.49434496615427]
Vision systems that see and reason about the compositional nature of visual scenes are fundamental to understanding our world.
The models learned to bridge the gap between such modalities coupled with large-scale training data facilitate contextual reasoning, generalization, and prompt capabilities at test time.
The output of such models can be modified through human-provided prompts without retraining, e.g., segmenting a particular object by providing a bounding box, having interactive dialogues by asking questions about an image or video scene or manipulating the robot's behavior through language instructions.
arXiv Detail & Related papers (2023-07-25T17:59:18Z)
- Sample-Efficient Learning of Novel Visual Concepts [7.398195748292981]
State-of-the-art deep learning models struggle to recognize novel objects in a few-shot setting.
We show that incorporating a symbolic knowledge graph into a state-of-the-art recognition model enables a new approach for effective few-shot classification.
arXiv Detail & Related papers (2023-06-15T20:24:30Z)
- Causal Reasoning Meets Visual Representation Learning: A Prospective Study [117.08431221482638]
Lack of interpretability, robustness, and out-of-distribution generalization are becoming challenges for existing visual models.
Inspired by the strong inference ability of human-level agents, recent years have witnessed great effort in developing causal reasoning paradigms.
This paper aims to provide a comprehensive overview of this emerging field, attract attention, encourage discussion, and bring to the forefront the urgency of developing novel causal reasoning methods.
arXiv Detail & Related papers (2022-04-26T02:22:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.