Diving Deep into Modes of Fact Hallucinations in Dialogue Systems
- URL: http://arxiv.org/abs/2301.04449v1
- Date: Wed, 11 Jan 2023 13:08:57 GMT
- Title: Diving Deep into Modes of Fact Hallucinations in Dialogue Systems
- Authors: Souvik Das, Sougata Saha and Rohini K. Srihari
- Abstract summary: Knowledge Graph (KG) grounded conversations often use large pre-trained models and usually suffer from fact hallucination.
We build an entity-level hallucination detection system, which would provide fine-grained signals that control fallacious content while generating responses.
- Score: 2.8360662552057323
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Knowledge Graph (KG) grounded conversations often use large
pre-trained models and usually suffer from fact hallucination. Frequently,
entities with no reference in the knowledge sources or the conversation
history are introduced into responses, hindering the flow of the
conversation. Existing work attempts to overcome this issue by tweaking the
training procedure or by using a multi-step refining method. However, minimal
effort has gone into building an entity-level hallucination detection system,
which would provide fine-grained signals that control fallacious content
while generating responses. As a first step to address this issue, we dive
deep to identify various modes of hallucination in KG-grounded chatbots
through human feedback analysis. Second, we propose a series of perturbation
strategies to create a synthetic dataset named FADE (FActual Dialogue
Hallucination DEtection Dataset). Finally, we conduct comprehensive data
analyses and create multiple baseline models for hallucination detection to
compare against human-verified data and already established benchmarks.
Related papers
- Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback [48.065569871444275]
We propose detecting and mitigating hallucinations in Large Vision Language Models (LVLMs) via fine-grained AI feedback.
We generate a small-scale hallucination annotation dataset with proprietary models.
Then, we propose a detect-then-rewrite pipeline to automatically construct a preference dataset for training a hallucination-mitigating model.
arXiv Detail & Related papers (2024-04-22T14:46:10Z)
- A Cause-Effect Look at Alleviating Hallucination of Knowledge-grounded Dialogue Generation [51.53917938874146]
We propose a possible solution for alleviating the hallucination in KGD by exploiting the dialogue-knowledge interaction.
Experimental results of our example implementation show that this method can reduce hallucination without degrading other aspects of dialogue performance.
arXiv Detail & Related papers (2024-04-04T14:45:26Z)
- DiaHalu: A Dialogue-level Hallucination Evaluation Benchmark for Large Language Models [26.289847386286446]
We propose DiaHalu, the first dialogue-level hallucination evaluation benchmark to our knowledge.
We integrate the collected topics into system prompts and facilitate a dialogue between two ChatGPT-3.5 instances.
We manually modify the contents that do not adhere to human language conventions and then have LLMs re-generate, simulating authentic human-machine interaction scenarios.
arXiv Detail & Related papers (2024-03-01T15:38:55Z)
- Fine-grained Hallucination Detection and Editing for Language Models [109.56911670376932]
Large language models (LMs) are prone to generating factual errors, which are often called hallucinations.
We introduce a comprehensive taxonomy of hallucinations and argue that hallucinations manifest in diverse forms.
We propose a novel task of automatic fine-grained hallucination detection and construct a new evaluation benchmark, FavaBench.
arXiv Detail & Related papers (2024-01-12T19:02:48Z)
- HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data [102.56792377624927]
Hallucinations inherent in machine-generated data remain under-explored.
We present a novel hallucination detection and elimination framework, HalluciDoctor, based on the cross-checking paradigm.
Our method mitigates 44.6% of hallucinations relative to the baseline and maintains performance competitive with LLaVA.
arXiv Detail & Related papers (2023-11-22T04:52:58Z)
- Towards Mitigating Hallucination in Large Language Models via Self-Reflection [63.2543947174318]
Large language models (LLMs) have shown promise for generative and knowledge-intensive tasks including question-answering (QA) tasks.
This paper analyses the phenomenon of hallucination in medical generative QA systems using widely adopted LLMs and datasets.
arXiv Detail & Related papers (2023-10-10T03:05:44Z)
- AutoHall: Automated Hallucination Dataset Generation for Large Language Models [56.92068213969036]
This paper introduces AutoHall, a method for automatically constructing model-specific hallucination datasets from existing fact-checking datasets.
We also propose a zero-resource and black-box hallucination detection method based on self-contradiction.
arXiv Detail & Related papers (2023-09-30T05:20:02Z)
- Trapping LLM Hallucinations Using Tagged Context Prompts [11.655802601887197]
We propose a novel method to recognize and flag instances when large language models perform outside their domain knowledge.
We find that the use of context combined with embedded tags can successfully combat hallucinations within generative language models.
arXiv Detail & Related papers (2023-06-09T17:48:54Z)
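The last entry above describes trapping hallucinations with tagged context prompts. Below is a minimal sketch of that general idea, assuming a simple [S#] tag format and a regex-based flagging rule; the prompt wording and tag scheme are assumptions, not the paper's implementation.

```python
import re

# Sketch of tagged-context prompting: each context sentence gets a tag,
# the model is asked to cite tags, and answer sentences that cite no tag
# are flagged as potentially out-of-context (hallucinated) claims.

def tag_context(sentences):
    """Prefix each context sentence with a reference tag like [S1]."""
    return [f"[S{i + 1}] {s}" for i, s in enumerate(sentences)]

def build_prompt(question, tagged_context):
    context_block = "\n".join(tagged_context)
    return (
        "Answer using only the tagged context below, and cite the tag "
        "for every claim you make.\n\n"
        f"Context:\n{context_block}\n\nQuestion: {question}\nAnswer:"
    )

def flag_untagged_claims(answer):
    """Return answer sentences that cite no [S#] tag (possible hallucinations)."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]
    return [s for s in sentences if not re.search(r"\[S\d+\]", s)]

if __name__ == "__main__":
    context = tag_context([
        "FADE is a synthetic dataset for dialogue hallucination detection.",
        "It is built by perturbing KG-grounded responses.",
    ])
    print(build_prompt("What is FADE?", context))
    answer = "FADE is a synthetic detection dataset [S1]. It won an award in 2020."
    print(flag_untagged_claims(answer))  # flags the untagged, unsupported claim
```

The flagged sentences would then be reviewed or regenerated; how the original paper scores and acts on untagged content is described in the paper itself.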