Related papers: Inpaint Biases: A Pathway to Accurate and Unbiased Image Generation

Inpaint Biases: A Pathway to Accurate and Unbiased Image Generation

URL: http://arxiv.org/abs/2405.18762v2
Date: Thu, 30 May 2024 10:58:56 GMT
Title: Inpaint Biases: A Pathway to Accurate and Unbiased Image Generation
Authors: Jiyoon Myung, Jihyeon Park,
Abstract summary: We introduce the Inpaint Biases framework, which employs user-defined masks and inpainting techniques to enhance the accuracy of image generation. We demonstrate how this framework significantly improves the fidelity of generated images to the user's intent, thereby expanding the models' creative capabilities.
Score: 0.0
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: This paper examines the limitations of advanced text-to-image models in accurately rendering unconventional concepts which are scarcely represented or absent in their training datasets. We identify how these limitations not only confine the creative potential of these models but also pose risks of reinforcing stereotypes. To address these challenges, we introduce the Inpaint Biases framework, which employs user-defined masks and inpainting techniques to enhance the accuracy of image generation, particularly for novel or inaccurately rendered objects. Through experimental validation, we demonstrate how this framework significantly improves the fidelity of generated images to the user's intent, thereby expanding the models' creative capabilities and mitigating the risk of perpetuating biases. Our study contributes to the advancement of text-to-image models as unbiased, versatile tools for creative expression.

Related papers

X-Omni: Reinforcement Learning Makes Discrete Autoregressive Image Generative Models Great Again [45.74833463136701]
We develop a semantic image tokenizer, a unified autoregressive model for both language and images, and an offline diffusion decoder for image generation termed X- Omni.<n>X- Omni achieves state-of-the-art performance in image generation tasks using a 7B language model, producing images with high aesthetic quality while exhibiting strong capabilities in following instructions and rendering long texts.
arXiv Detail & Related papers (2025-07-29T17:59:04Z)
Where's the liability in the Generative Era? Recovery-based Black-Box Detection of AI-Generated Content [42.68683643671603]
We introduce a novel black box detection framework that requires only API access.<n>We measure the likelihood that the image was generated by the model itself.<n>For black-box models that do not support masked image inputs, we incorporate a cost efficient surrogate model trained to align with the target model distribution.
arXiv Detail & Related papers (2025-05-02T05:11:35Z)
Boosting Alignment for Post-Unlearning Text-to-Image Generative Models [55.82190434534429]
Large-scale generative models have shown impressive image-generation capabilities, propelled by massive data.<n>This often inadvertently leads to the generation of harmful or inappropriate content and raises copyright concerns.<n>We propose a framework that seeks an optimal model update at each unlearning iteration, ensuring monotonic improvement on both objectives.
arXiv Detail & Related papers (2024-12-09T21:36:10Z)
PrefPaint: Aligning Image Inpainting Diffusion Model with Human Preference [62.72779589895124]
We make the first attempt to align diffusion models for image inpainting with human aesthetic standards via a reinforcement learning framework. We train a reward model with a dataset we construct, consisting of nearly 51,000 images annotated with human preferences. Experiments on inpainting comparison and downstream tasks, such as image extension and 3D reconstruction, demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-10-29T11:49:39Z)
KITTEN: A Knowledge-Intensive Evaluation of Image Generation on Visual Entities [93.74881034001312]
We conduct a systematic study on the fidelity of entities in text-to-image generation models. We focus on their ability to generate a wide range of real-world visual entities, such as landmark buildings, aircraft, plants, and animals. Our findings reveal that even the most advanced text-to-image models often fail to generate entities with accurate visual details.
arXiv Detail & Related papers (2024-10-15T17:50:37Z)
Towards Effective User Attribution for Latent Diffusion Models via Watermark-Informed Blending [54.26862913139299]
We introduce a novel framework Towards Effective user Attribution for latent diffusion models via Watermark-Informed Blending (TEAWIB) TEAWIB incorporates a unique ready-to-use configuration approach that allows seamless integration of user-specific watermarks into generative models. Experiments validate the effectiveness of TEAWIB, showcasing the state-of-the-art performance in perceptual quality and attribution accuracy.
arXiv Detail & Related papers (2024-09-17T07:52:09Z)
Reinforcing Pre-trained Models Using Counterfactual Images [54.26310919385808]
This paper proposes a novel framework to reinforce classification models using language-guided generated counterfactual images. We identify model weaknesses by testing the model using the counterfactual image dataset. We employ the counterfactual images as an augmented dataset to fine-tune and reinforce the classification model.
arXiv Detail & Related papers (2024-06-19T08:07:14Z)
Regeneration Based Training-free Attribution of Fake Images Generated by Text-to-Image Generative Models [39.33821502730661]
We present a training-free method to attribute fake images generated by text-to-image models to their source models. By calculating and ranking the similarity of the test image and the candidate images, we can determine the source of the image.
arXiv Detail & Related papers (2024-03-03T11:55:49Z)
Semantic Draw Engineering for Text-to-Image Creation [2.615648035076649]
We propose a method that utilizes artificial intelligence models for thematic creativity. The method involves converting all visual elements into quantifiable data structures before creating images. We evaluate the effectiveness of this approach in terms of semantic accuracy, image efficiency, and computational efficiency.
arXiv Detail & Related papers (2023-12-23T05:35:15Z)
DreamArtist: Towards Controllable One-Shot Text-to-Image Generation via Positive-Negative Prompt-Tuning [85.10894272034135]
Large-scale text-to-image generation models have achieved remarkable progress in synthesizing high-quality, feature-rich images with high resolution guided by texts. Recent attempts have employed fine-tuning or prompt-tuning strategies to teach the pre-trained diffusion model novel concepts from a reference image set. We present a simple yet effective method called DreamArtist, which employs a positive-negative prompt-tuning learning strategy.
arXiv Detail & Related papers (2022-11-21T10:37:56Z)
Language Does More Than Describe: On The Lack Of Figurative Speech in Text-To-Image Models [63.545146807810305]
Text-to-image diffusion models can generate high-quality pictures from textual input prompts. These models have been trained using text data collected from content-based labelling protocols. We characterise the sentimentality, objectiveness and degree of abstraction of publicly available text data used to train current text-to-image diffusion models.
arXiv Detail & Related papers (2022-10-19T14:20:05Z)
Adaptively-Realistic Image Generation from Stroke and Sketch with Diffusion Model [31.652827838300915]
We propose a unified framework supporting a three-dimensional control over the image synthesis from sketches and strokes based on diffusion models. Our framework achieves state-of-the-art performance while providing flexibility in generating customized images with control over shape, color, and realism. Our method unleashes applications such as editing on real images, generation with partial sketches and strokes, and multi-domain multi-modal synthesis.
arXiv Detail & Related papers (2022-08-26T13:59:26Z)

This list is automatically generated from the titles and abstracts of the papers in this site.