Detecting and Preventing Hallucinations in Large Vision Language Models
- URL: http://arxiv.org/abs/2308.06394v3
- Date: Sun, 11 Feb 2024 08:38:07 GMT
- Title: Detecting and Preventing Hallucinations in Large Vision Language Models
- Authors: Anisha Gunjal, Jihan Yin, Erhan Bas
- Abstract summary: M-HalDetect is the first multi-modal hallucination detection dataset for detailed image descriptions.
We train fine-grained multi-modal reward models from InstructBLIP and evaluate their effectiveness with best-of-n rejection sampling.
We find that our reward model generalizes to other multi-modal models, reducing hallucinations in LLaVA and mPLUG-OWL by 15% and 57% respectively.
- Score: 4.7264116948935975
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Instruction tuned Large Vision Language Models (LVLMs) have significantly
advanced in generalizing across a diverse set of multi-modal tasks, especially
for Visual Question Answering (VQA). However, generating detailed responses
that are visually grounded is still a challenging task for these models. We
find that even the current state-of-the-art LVLM, InstructBLIP, still produces
hallucinatory text at a staggering rate of 30 percent, in the form of
non-existent objects, unfaithful descriptions, and inaccurate relationships. To address
this, we introduce M-HalDetect, a (M)ultimodal (Hal)lucination (Detect)ion
Dataset that can be used to train and benchmark models for hallucination
detection and prevention. M-HalDetect consists of 16k fine-grained annotations
on VQA examples, making it the first comprehensive multi-modal hallucination
detection dataset for detailed image descriptions. Unlike previous work that
only considers object hallucination, we additionally annotate both entity
descriptions and relationships that are unfaithful. To demonstrate the
potential of this dataset for hallucination prevention, we optimize
InstructBLIP through our novel Fine-grained Direct Preference Optimization
(FDPO). We also train fine-grained multi-modal reward models from InstructBLIP
and evaluate their effectiveness with best-of-n rejection sampling. We perform
human evaluation on both FDPO and rejection sampling, and find that they reduce
hallucination rates in InstructBLIP by 41% and 55% respectively. We also find
that our reward model generalizes to other multi-modal models, reducing
hallucinations in LLaVA and mPLUG-OWL by 15% and 57% respectively, and has
strong correlation with human evaluated accuracy scores.
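As a rough illustration of the best-of-n rejection sampling described above, the sketch below scores each sentence of a sampled description with a fine-grained reward model and keeps the highest-scoring candidate. All names here (`generate`, `reward_fn`, the sentence splitter) are hypothetical placeholders under assumed interfaces, not the paper's released code.

```python
# Minimal sketch of best-of-n rejection sampling with a fine-grained
# (sentence-level) reward model. `generate` and `reward_fn` are hypothetical
# callables standing in for an LVLM sampler and a trained reward model.

import re
from typing import Callable, List, Tuple


def split_sentences(text: str) -> List[str]:
    """Naive sentence splitter; a real pipeline would segment more carefully."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def best_of_n(
    image,
    prompt: str,
    generate: Callable[[object, str], str],     # hypothetical: samples one description per call
    reward_fn: Callable[[object, str], float],  # hypothetical: scores one sentence for groundedness
    n: int = 8,
) -> Tuple[str, float]:
    """Sample n candidate descriptions and keep the one whose sentences score
    highest on average under the reward model (best-of-n rejection sampling)."""
    best_text, best_score = "", float("-inf")
    for _ in range(n):
        candidate = generate(image, prompt)
        sentences = split_sentences(candidate)
        if not sentences:
            continue
        # Mean of sentence-level scores; min would punish a single
        # hallucinated sentence more harshly.
        score = sum(reward_fn(image, s) for s in sentences) / len(sentences)
        if score > best_score:
            best_text, best_score = candidate, score
    return best_text, best_score
```

The aggregation rule (mean versus minimum sentence score) and the number of samples n are free choices in this sketch; the reductions reported in the abstract come from the authors' own reward models and human evaluation, not from this code.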
Related papers
- Reefknot: A Comprehensive Benchmark for Relation Hallucination Evaluation, Analysis and Mitigation in Multimodal Large Language Models [13.48296910438554]
Hallucination issues persistently plague current multimodal large language models (MLLMs).
We introduce Reefknot, a benchmark specifically targeting relation hallucinations, consisting of over 20,000 samples derived from real-world scenarios.
Our comparative evaluation across three distinct tasks revealed a substantial shortcoming in the capabilities of current MLLMs to mitigate relation hallucinations.
arXiv Detail & Related papers (2024-08-18T10:07:02Z) - ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models [65.12177400764506]
Large language models (LLMs) exhibit hallucinations in long-form question-answering tasks across various domains and wide applications.
Current hallucination detection and mitigation datasets are limited in domains and sizes.
This paper introduces an iterative self-training framework that simultaneously and progressively scales up the hallucination annotation dataset.
arXiv Detail & Related papers (2024-07-05T17:56:38Z) - Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback [48.065569871444275]
We propose detecting and mitigating hallucinations in Large Vision Language Models (LVLMs) via fine-grained AI feedback.
We generate a small-scale hallucination annotation dataset using proprietary models.
We then propose a detect-then-rewrite pipeline to automatically construct a preference dataset for training a hallucination-mitigating model.
arXiv Detail & Related papers (2024-04-22T14:46:10Z) - VALOR-EVAL: Holistic Coverage and Faithfulness Evaluation of Large Vision-Language Models [57.43276586087863]
Large Vision-Language Models (LVLMs) suffer from hallucination issues, wherein the models generate plausible-sounding but factually incorrect outputs.
Existing benchmarks are often limited in scope, focusing mainly on object hallucinations.
We introduce a multi-dimensional benchmark covering objects, attributes, and relations, with challenging images selected based on associative biases.
arXiv Detail & Related papers (2024-04-22T04:49:22Z) - Prescribing the Right Remedy: Mitigating Hallucinations in Large Vision-Language Models via Targeted Instruction Tuning [15.156359255401812]
We propose a targeted instruction data generation framework named DFTG that is tailored to the hallucination specificity of different models.
The experimental results on hallucination benchmarks demonstrate that the targeted instruction data generated by our method are more effective in mitigating hallucinations compared to previous datasets.
arXiv Detail & Related papers (2024-04-16T07:14:32Z) - Multi-Modal Hallucination Control by Visual Information Grounding [121.6983694815504]
We show that Generative Vision-Language Models (VLMs) are prone to generating plausible-sounding textual answers that are not always grounded in the input image.
We introduce Multi-Modal Mutual-Information Decoding (M3ID), a new sampling method for prompt amplification.
M3ID amplifies the influence of the reference image over the language prior, hence favoring the generation of tokens with higher mutual information with the visual prompt.
arXiv Detail & Related papers (2024-03-20T22:05:18Z) - Aligning Modalities in Vision Large Language Models via Preference Fine-tuning [67.62925151837675]
In this work, we frame the hallucination problem as an alignment issue and tackle it with preference tuning.
Specifically, we propose POVID to generate feedback data with AI models.
We use ground-truth instructions as the preferred responses and a two-stage approach to generate dispreferred data.
In experiments across broad benchmarks, we show that our approach not only reduces hallucinations but also improves model performance on standard benchmarks, outperforming prior approaches.
arXiv Detail & Related papers (2024-02-18T00:56:16Z) - Mitigating Hallucination in Visual Language Models with Visual Supervision [33.05550629039951]
Large vision-language models (LVLMs) suffer heavily from hallucination.
The key problem lies in their weak ability to comprehend detailed content in a multi-modal context.
In this paper, we bring more detailed vision annotations and more discriminative vision models to facilitate the training of LVLMs.
arXiv Detail & Related papers (2023-11-27T09:30:02Z) - AutoHall: Automated Hallucination Dataset Generation for Large Language Models [56.92068213969036]
This paper introduces AutoHall, a method for automatically constructing model-specific hallucination datasets based on existing fact-checking datasets.
We also propose a zero-resource and black-box hallucination detection method based on self-contradiction.
arXiv Detail & Related papers (2023-09-30T05:20:02Z)
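The AutoHall entry above mentions a zero-resource, black-box detection method based on self-contradiction. The sketch below illustrates only that general idea, sampling several answers to the same question and flagging mutual contradictions; `sample_answer` and `contradicts` are hypothetical placeholders (for example, a sampled model call and an NLI-style judge), not AutoHall's actual pipeline.

```python
# Minimal sketch of zero-resource, black-box hallucination detection via
# self-contradiction: sample several answers to the same question and flag
# the response if the samples conflict with one another. The callables are
# hypothetical placeholders; the actual AutoHall pipeline may differ.

from typing import Callable, List


def self_contradiction_score(
    question: str,
    sample_answer: Callable[[str], str],      # hypothetical: black-box model call with sampling enabled
    contradicts: Callable[[str, str], bool],  # hypothetical: True if two answers conflict
    k: int = 5,
) -> float:
    """Return the fraction of sampled answer pairs that contradict each other."""
    answers: List[str] = [sample_answer(question) for _ in range(k)]
    pairs = [(a, b) for i, a in enumerate(answers) for b in answers[i + 1:]]
    if not pairs:
        return 0.0
    return sum(1 for a, b in pairs if contradicts(a, b)) / len(pairs)


def is_likely_hallucination(score: float, threshold: float = 0.3) -> bool:
    """Flag an answer set as likely hallucinated above a chosen score threshold."""
    return score >= threshold
```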
This list is automatically generated from the titles and abstracts of the papers on this site.