AIGCs Confuse AI Too: Investigating and Explaining Synthetic Image-induced Hallucinations in Large Vision-Language Models
- URL: http://arxiv.org/abs/2403.08542v2
- Date: Tue, 3 Sep 2024 01:53:19 GMT
- Title: AIGCs Confuse AI Too: Investigating and Explaining Synthetic Image-induced Hallucinations in Large Vision-Language Models
- Authors: Yifei Gao, Jiaqi Wang, Zhiyu Lin, Jitao Sang,
- Abstract summary: We highlight the exacerbated hallucination phenomena in Large Vision-Language Models (LVLMs) caused by AI-synthetic images.
Remarkably, our findings shed light on a consistent AIGC textbfhallucination bias: the object hallucinations induced by synthetic images are characterized by a greater quantity.
Our investigations on Q-former and Linear projector reveal that synthetic images may present token deviations after visual projection, thereby amplifying the hallucination bias.
- Score: 37.04195231708092
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The evolution of Artificial Intelligence Generated Contents (AIGCs) is advancing towards higher quality. The growing interactions with AIGCs present a new challenge to the data-driven AI community: While AI-generated contents have played a crucial role in a wide range of AI models, the potential hidden risks they introduce have not been thoroughly examined. Beyond human-oriented forgery detection, AI-generated content poses potential issues for AI models originally designed to process natural data. In this study, we underscore the exacerbated hallucination phenomena in Large Vision-Language Models (LVLMs) caused by AI-synthetic images. Remarkably, our findings shed light on a consistent AIGC \textbf{hallucination bias}: the object hallucinations induced by synthetic images are characterized by a greater quantity and a more uniform position distribution, even these synthetic images do not manifest unrealistic or additional relevant visual features compared to natural images. Moreover, our investigations on Q-former and Linear projector reveal that synthetic images may present token deviations after visual projection, thereby amplifying the hallucination bias.
Related papers
- A Sanity Check for AI-generated Image Detection [49.08585395873425]
We present a sanity check on whether the task of AI-generated image detection has been solved.
To quantify the generalization of existing methods, we evaluate 9 off-the-shelf AI-generated image detectors on Chameleon dataset.
We propose AIDE (AI-generated Image DEtector with Hybrid Features), which leverages multiple experts to simultaneously extract visual artifacts and noise patterns.
arXiv Detail & Related papers (2024-06-27T17:59:49Z) - Exploring the Naturalness of AI-Generated Images [59.04528584651131]
We take the first step to benchmark and assess the visual naturalness of AI-generated images.
We propose the Joint Objective Image Naturalness evaluaTor (JOINT), to automatically predict the naturalness of AGIs that aligns human ratings.
We demonstrate that JOINT significantly outperforms baselines for providing more subjectively consistent results on naturalness assessment.
arXiv Detail & Related papers (2023-12-09T06:08:09Z) - Invisible Relevance Bias: Text-Image Retrieval Models Prefer AI-Generated Images [67.18010640829682]
We show that AI-generated images introduce an invisible relevance bias to text-image retrieval models.
The inclusion of AI-generated images in the training data of the retrieval models exacerbates the invisible relevance bias.
We propose an effective training method aimed at alleviating the invisible relevance bias.
arXiv Detail & Related papers (2023-11-23T16:22:58Z) - AI-Generated Images as Data Source: The Dawn of Synthetic Era [61.879821573066216]
generative AI has unlocked the potential to create synthetic images that closely resemble real-world photographs.
This paper explores the innovative concept of harnessing these AI-generated images as new data sources.
In contrast to real data, AI-generated data exhibit remarkable advantages, including unmatched abundance and scalability.
arXiv Detail & Related papers (2023-10-03T06:55:19Z) - Augmenting medical image classifiers with synthetic data from latent
diffusion models [12.077733447347592]
We show that latent diffusion models can scalably generate images of skin disease.
We generate and analyze a new dataset of 458,920 synthetic images produced using several generation strategies.
arXiv Detail & Related papers (2023-08-23T22:34:49Z) - DeepfakeArt Challenge: A Benchmark Dataset for Generative AI Art Forgery and Data Poisoning Detection [57.51313366337142]
There has been growing concern over the use of generative AI for malicious purposes.
In the realm of visual content synthesis using generative AI, key areas of significant concern has been image forgery and data poisoning.
We introduce the DeepfakeArt Challenge, a large-scale challenge benchmark dataset designed specifically to aid in the building of machine learning algorithms for generative AI art forgery and data poisoning detection.
arXiv Detail & Related papers (2023-06-02T05:11:27Z) - The Beauty or the Beast: Which Aspect of Synthetic Medical Images
Deserves Our Focus? [1.6305276867803995]
Training medical AI algorithms requires large volumes of accurately labeled datasets.
Synthetic images generated from deep generative models can help alleviate the data scarcity problem, but their effectiveness relies on their fidelity to real-world images.
arXiv Detail & Related papers (2023-05-03T09:09:54Z) - ViT-DAE: Transformer-driven Diffusion Autoencoder for Histopathology
Image Analysis [4.724009208755395]
We present ViT-DAE, which integrates vision transformers (ViT) and diffusion autoencoders for high-quality histopathology image synthesis.
Our approach outperforms recent GAN-based and vanilla DAE methods in generating realistic images.
arXiv Detail & Related papers (2023-04-03T15:00:06Z) - CIFAKE: Image Classification and Explainable Identification of
AI-Generated Synthetic Images [7.868449549351487]
This article proposes to enhance our ability to recognise AI-generated images through computer vision.
The two sets of data present as a binary classification problem with regard to whether the photograph is real or generated by AI.
This study proposes the use of a Convolutional Neural Network (CNN) to classify the images into two categories; Real or Fake.
arXiv Detail & Related papers (2023-03-24T16:33:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.