Interpreting and Mitigating Hallucination in MLLMs through Multi-agent Debate
- URL: http://arxiv.org/abs/2407.20505v1
- Date: Tue, 30 Jul 2024 02:41:32 GMT
- Title: Interpreting and Mitigating Hallucination in MLLMs through Multi-agent Debate
- Authors: Zheng Lin, Zhenxing Niu, Zhibin Wang, Yinghui Xu,
- Abstract summary: We argue that hallucination in MLLMs is partially due to a lack of slow-thinking and divergent-thinking in these models.
Our approach can not only hallucinations but also interpret why they occur and detail the specifics of hallucination.
- Score: 34.17353224636788
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: MLLMs often generate outputs that are inconsistent with the visual content, a challenge known as hallucination. Previous methods focus on determining whether a generated output is hallucinated, without identifying which image region leads to the hallucination or interpreting why such hallucinations occur. In this paper, we argue that hallucination in MLLMs is partially due to a lack of slow-thinking and divergent-thinking in these models. To address this, we propose adopting a self-reflection scheme to promote slow-thinking. Furthermore, we consider eliminating hallucination as a complex reasoning task and propose a multi-agent debate approach to encourage divergent-thinking. Consequently, our approach can not only mitigate hallucinations but also interpret why they occur and detail the specifics of hallucination. In addition, we propose to distinguish creativity from hallucination in the context of MLLMs, and illustrate how to evaluate MLLMs' creativity capability. Extensive experiments on various benchmarks demonstrate that our approach exhibits generalized hallucinations-mitigating performance across several MLLMs.
Related papers
- Hallu-PI: Evaluating Hallucination in Multi-modal Large Language Models within Perturbed Inputs [54.50483041708911]
Hallu-PI is the first benchmark designed to evaluate hallucination in MLLMs within Perturbed Inputs.
Hallu-PI consists of seven perturbed scenarios, containing 1,260 perturbed images from 11 object types.
Our research reveals a severe bias in MLLMs' ability to handle different types of hallucinations.
arXiv Detail & Related papers (2024-08-02T16:07:15Z) - Hallucination Diversity-Aware Active Learning for Text Summarization [46.00645048690819]
Large Language Models (LLMs) have shown propensity to generate hallucinated outputs, i.e., texts that are factually incorrect or unsupported.
Existing methods for alleviating hallucinations typically require costly human annotations to identify and correct hallucinations in LLM outputs.
We propose the first active learning framework to alleviate LLM hallucinations, reducing costly human annotations of hallucination needed.
arXiv Detail & Related papers (2024-04-02T02:30:27Z) - Pensieve: Retrospect-then-Compare Mitigates Visual Hallucination [14.25488878224697]
We propose Pensieve, a training-free method that leverages the analogous visual hallucinations, which are induced by images sharing common semantic and appearance characteristics.
Pensieve mitigates the effects of addressing errors from both the visual and textual branches by adaptively scaling the subtracted scores.
arXiv Detail & Related papers (2024-03-21T13:49:42Z) - Hal-Eval: A Universal and Fine-grained Hallucination Evaluation Framework for Large Vision Language Models [35.45859414670449]
We introduce a refined taxonomy of hallucinations, featuring a new category: Event Hallucination.
We then utilize advanced LLMs to generate and filter fine grained hallucinatory data consisting of various types of hallucinations.
The proposed benchmark distinctively assesses LVLMs ability to tackle a broad spectrum of hallucinations.
arXiv Detail & Related papers (2024-02-24T05:14:52Z) - A Survey on Hallucination in Large Vision-Language Models [18.540878498840435]
Large Vision-Language Models (LVLMs) have attracted growing attention within the AI landscape for its practical implementation potential.
However, hallucination'', or more specifically, the misalignment between factual visual content and corresponding textual generation, poses a significant challenge of utilizing LVLMs.
We dissect LVLM-related hallucinations in an attempt to establish an overview and facilitate future mitigation.
arXiv Detail & Related papers (2024-02-01T00:33:21Z) - The Dawn After the Dark: An Empirical Study on Factuality Hallucination
in Large Language Models [134.6697160940223]
hallucination poses great challenge to trustworthy and reliable deployment of large language models.
Three key questions should be well studied: how to detect hallucinations (detection), why do LLMs hallucinate (source), and what can be done to mitigate them.
This work presents a systematic empirical study on LLM hallucination, focused on the the three aspects of hallucination detection, source and mitigation.
arXiv Detail & Related papers (2024-01-06T12:40:45Z) - Hallucination Augmented Contrastive Learning for Multimodal Large
Language Model [53.65682783591723]
Multi-modal large language models (MLLMs) have been shown to efficiently integrate natural language with visual information to handle multi-modal tasks.
However, MLLMs still face a fundamental limitation of hallucinations, where they tend to generate erroneous or fabricated information.
In this paper, we address hallucinations in MLLMs from a novel perspective of representation learning.
arXiv Detail & Related papers (2023-12-12T04:05:15Z) - HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data [102.56792377624927]
hallucinations inherent in machine-generated data remain under-explored.
We present a novel hallucination detection and elimination framework, HalluciDoctor, based on the cross-checking paradigm.
Our method successfully mitigates 44.6% hallucinations relatively and maintains competitive performance compared to LLaVA.
arXiv Detail & Related papers (2023-11-22T04:52:58Z) - A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions [40.79317187623401]
The emergence of large language models (LLMs) has marked a significant breakthrough in natural language processing (NLP)
LLMs are prone to hallucination, generating plausible yet nonfactual content.
This phenomenon raises significant concerns over the reliability of LLMs in real-world information retrieval systems.
arXiv Detail & Related papers (2023-11-09T09:25:37Z) - Evaluation and Analysis of Hallucination in Large Vision-Language Models [49.19829480199372]
Large Vision-Language Models (LVLMs) have recently achieved remarkable success.
LVLMs are still plagued by the hallucination problem.
Hallucination refers to the information of LVLMs' responses that does not exist in the visual input.
arXiv Detail & Related papers (2023-08-29T08:51:24Z) - Evaluating Object Hallucination in Large Vision-Language Models [122.40337582958453]
This work presents the first systematic study on object hallucination of large vision-language models (LVLMs)
We find that LVLMs tend to generate objects that are inconsistent with the target images in the descriptions.
We propose a polling-based query method called POPE to evaluate the object hallucination.
arXiv Detail & Related papers (2023-05-17T16:34:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.