Hallucination Detection for Grounded Instruction Generation
- URL: http://arxiv.org/abs/2310.15319v1
- Date: Mon, 23 Oct 2023 19:36:28 GMT
- Title: Hallucination Detection for Grounded Instruction Generation
- Authors: Lingjun Zhao, Khanh Nguyen, Hal Daumé III
- Abstract summary: A major issue with current models is hallucination.
We develop a model that detects these hallucinations by adopting a model pre-trained on a large corpus of image-text pairs.
- Score: 8.432152982202785
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We investigate the problem of generating instructions to guide humans to
navigate in simulated residential environments. A major issue with current
models is hallucination: they generate references to actions or objects that
are inconsistent with what a human follower would perform or encounter along
the described path. We develop a model that detects these hallucinated
references by adopting a model pre-trained on a large corpus of image-text
pairs, and fine-tuning it with a contrastive loss that separates correct
instructions from instructions containing synthesized hallucinations. Our final
model outperforms several baselines, including using word probability estimated
by the instruction-generation model, and supervised models based on LSTM and
Transformer.
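Below is a minimal sketch of the kind of contrastive fine-tuning the abstract describes, assuming a CLIP-style image-text encoder (`openai/clip-vit-base-patch32` via Hugging Face), a hinge-style margin loss, and sequence-level scoring; the paper's actual model, loss formulation, and word/phrase-level hallucination predictions may differ.
```python
# Sketch: fine-tune a CLIP-style image-text model so that correct instructions
# score higher than instructions containing synthesized hallucinations.
# The checkpoint, margin value, and hinge-style loss are assumptions of this
# sketch, not the paper's exact configuration.
import torch
import torch.nn.functional as F
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
MARGIN = 0.2

def score(images, texts):
    """Cosine similarity between pooled image and text embeddings (one per pair)."""
    inputs = processor(text=texts, images=images, return_tensors="pt",
                       padding=True, truncation=True)
    outputs = model(**inputs)
    img = F.normalize(outputs.image_embeds, dim=-1)
    txt = F.normalize(outputs.text_embeds, dim=-1)
    return (img * txt).sum(dim=-1)

def train_step(images, correct_instructions, hallucinated_instructions):
    """Contrastive step: push correct instructions above hallucinated ones by a margin."""
    pos = score(images, correct_instructions)
    neg = score(images, hallucinated_instructions)
    loss = F.relu(MARGIN - (pos - neg)).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```
At inference time, instructions (or spans) whose similarity to the path observations falls below a threshold would be flagged as likely hallucinated; the threshold is another assumption of this sketch.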
Related papers
- From Loops to Oops: Fallback Behaviors of Language Models Under Uncertainty [67.81977289444677]
Large language models (LLMs) often exhibit undesirable behaviors, such as hallucinations and sequence repetitions.
We categorize fallback behaviors -- sequence repetitions, degenerate text, and hallucinations -- and extensively analyze them.
Our experiments reveal a clear and consistent ordering of fallback behaviors across all these axes.
arXiv Detail & Related papers (2024-07-08T16:13:42Z)
- Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback [48.065569871444275]
We propose detecting and mitigating hallucinations in Large Vision Language Models (LVLMs) via fine-grained AI feedback.
We first generate a small-scale hallucination annotation dataset using proprietary models.
Then, we propose a detect-then-rewrite pipeline to automatically construct a preference dataset for training a hallucination-mitigating model.
arXiv Detail & Related papers (2024-04-22T14:46:10Z)
- Prescribing the Right Remedy: Mitigating Hallucinations in Large Vision-Language Models via Targeted Instruction Tuning [15.156359255401812]
We propose DFTG, a targeted instruction data generation framework tailored to the hallucination specificity of different models.
Experimental results on hallucination benchmarks demonstrate that the targeted instruction data generated by our method are more effective at mitigating hallucinations than previous datasets.
arXiv Detail & Related papers (2024-04-16T07:14:32Z)
- Unfamiliar Finetuning Examples Control How Language Models Hallucinate [75.03210107477157]
Large language models are known to hallucinate when faced with unfamiliar queries.
We find that unfamiliar examples in the models' finetuning data are crucial in shaping these errors.
Our work further investigates RL finetuning strategies for improving the factuality of long-form model generations.
arXiv Detail & Related papers (2024-03-08T18:28:13Z)
- Aligning Modalities in Vision Large Language Models via Preference Fine-tuning [67.62925151837675]
In this work, we frame the hallucination problem as an alignment issue and tackle it with preference fine-tuning.
Specifically, we propose POVID to generate feedback data with AI models.
We use ground-truth instructions as the preferred response and a two-stage approach to generate dispreferred data.
In experiments across a broad range of benchmarks, we show that our approach not only reduces hallucinations but also improves model performance on standard benchmarks, outperforming prior approaches.
arXiv Detail & Related papers (2024-02-18T00:56:16Z)
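A minimal sketch of preference fine-tuning on such (preferred, dispreferred) pairs, assuming a DPO-style objective; POVID's actual two-stage procedure and loss modifications are not reproduced here.
```python
# Sketch: DPO-style preference loss on (preferred, dispreferred) response pairs.
# The beta value and the DPO formulation itself are assumptions of this sketch.
import torch
import torch.nn.functional as F

def dpo_loss(policy_logp_pref, policy_logp_dispref,
             ref_logp_pref, ref_logp_dispref, beta=0.1):
    """Each argument is the summed log-probability a model assigns to a full response:
    `policy_*` from the model being tuned, `ref_*` from a frozen reference copy.
    Preferred responses are ground-truth answers; dispreferred ones contain
    injected hallucinations."""
    pref_ratio = policy_logp_pref - ref_logp_pref
    dispref_ratio = policy_logp_dispref - ref_logp_dispref
    return -F.logsigmoid(beta * (pref_ratio - dispref_ratio)).mean()

# Dummy log-probabilities for a batch of two preference pairs.
loss = dpo_loss(torch.tensor([-12.3, -9.8]), torch.tensor([-11.0, -10.5]),
                torch.tensor([-12.0, -10.0]), torch.tensor([-10.8, -10.2]))
```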
- AutoHall: Automated Hallucination Dataset Generation for Large Language Models [56.92068213969036]
This paper introduces AutoHall, a method for automatically constructing model-specific hallucination datasets from existing fact-checking datasets.
We also propose a zero-resource and black-box hallucination detection method based on self-contradiction.
arXiv Detail & Related papers (2023-09-30T05:20:02Z)
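A minimal sketch of the zero-resource, black-box, self-contradiction idea from the entry above, assuming an off-the-shelf NLI classifier (`roberta-large-mnli`), a black-box `generate` sampler, and a simple contradiction-rate threshold; AutoHall's exact sampling and decision rules may differ.
```python
# Sketch: zero-resource, black-box detection by sampling extra answers and checking
# whether they contradict the claim with an off-the-shelf NLI model.
# The `generate` callable, NLI checkpoint, sample count, and threshold are assumptions.
from transformers import pipeline

nli = pipeline("text-classification", model="roberta-large-mnli")

def contradiction_rate(claim, question, generate, n_samples=5):
    """Fraction of independently sampled answers that contradict `claim`.
    `generate(question)` is any black-box call returning one sampled answer."""
    contradictions = 0
    for _ in range(n_samples):
        sample = generate(question)
        # premise = sampled answer, hypothesis = original claim
        result = nli([{"text": sample, "text_pair": claim}])[0]
        if result["label"] == "CONTRADICTION":
            contradictions += 1
    return contradictions / n_samples

def is_hallucination(claim, question, generate, threshold=0.5):
    return contradiction_rate(claim, question, generate) >= threshold
```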
- Detecting and Preventing Hallucinations in Large Vision Language Models [4.7264116948935975]
M-HalDetect is the first multi-modal hallucination detection dataset for detailed image descriptions.
We train fine-grained multi-modal reward models from InstructBLIP and evaluate their effectiveness with best-of-n rejection sampling.
We find that our reward model generalizes to other multi-modal models, reducing hallucinations in LLaVA and mPLUG-OWL by 15% and 57% respectively.
arXiv Detail & Related papers (2023-08-11T21:35:20Z)
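A minimal sketch of the best-of-n rejection sampling used in the entry above, assuming placeholder `generate` and `reward` callables; the fine-grained reward models trained from InstructBLIP are not reproduced here.
```python
# Sketch: best-of-n rejection sampling -- draw n candidate descriptions and keep the
# one the reward model scores highest. `generate` and `reward` are placeholders for
# the sampler and the trained reward model, not the paper's code.
from typing import Callable, List

def best_of_n(image, prompt,
              generate: Callable[[object, str], str],
              reward: Callable[[object, str], float],
              n: int = 8) -> str:
    """Sample n candidates and return the one with the highest reward."""
    candidates: List[str] = [generate(image, prompt) for _ in range(n)]
    scores = [reward(image, c) for c in candidates]
    best_index = max(range(n), key=lambda i: scores[i])
    return candidates[best_index]
```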
- Diving Deep into Modes of Fact Hallucinations in Dialogue Systems [2.8360662552057323]
Knowledge Graph (KG) grounded conversations often use large pre-trained models and usually suffer from fact hallucination.
We build an entity-level hallucination detection system that provides fine-grained signals to control fallacious content during response generation.
arXiv Detail & Related papers (2023-01-11T13:08:57Z)
- To what extent do human explanations of model behavior align with actual model behavior? [91.67905128825402]
We investigated the extent to which human-generated explanations of models' inference decisions align with how models actually make these decisions.
We defined two alignment metrics that quantify how well natural language human explanations align with model sensitivity to input words.
We find that a model's alignment with human explanations is not predicted by the model's accuracy on NLI.
arXiv Detail & Related papers (2020-12-24T17:40:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.