Contrastive Learning Reduces Hallucination in Conversations
- URL: http://arxiv.org/abs/2212.10400v1
- Date: Tue, 20 Dec 2022 16:26:18 GMT
- Title: Contrastive Learning Reduces Hallucination in Conversations
- Authors: Weiwei Sun, Zhengliang Shi, Shen Gao, Pengjie Ren, Maarten de Rijke,
Zhaochun Ren
- Abstract summary: We propose a contrastive learning scheme, named MixCL.
A novel mixed contrastive objective is proposed to explicitly optimize the implicit knowledge elicitation process of LMs.
We show that MixCL achieves comparable performance to state-of-the-art KB-based approaches.
- Score: 76.55116206021346
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Pre-trained language models (LMs) store knowledge in their parameters and can
generate informative responses when used in conversational systems. However,
LMs suffer from the problem of "hallucination": they may generate
plausible-looking statements that are irrelevant or factually incorrect. To
address this problem, we propose a contrastive learning scheme, named MixCL. A
novel mixed contrastive objective is proposed to explicitly optimize the
implicit knowledge elicitation process of LMs, and thus reduce their
hallucination in conversations. We also examine negative sampling strategies of
retrieved hard negatives and model-generated negatives. We conduct experiments
on Wizard-of-Wikipedia, a public, open-domain knowledge-grounded dialogue
benchmark, and assess the effectiveness of MixCL. MixCL effectively reduces the
hallucination of LMs in conversations and achieves the highest performance
among LM-based dialogue agents in terms of relevancy and factuality. We show
that MixCL achieves comparable performance to state-of-the-art KB-based
approaches while enjoying notable advantages in terms of efficiency and
scalability.
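The mixed contrastive objective described above pulls the model's representation of the dialogue context toward the grounded (factual) response while pushing it away from hard negatives such as retrieved distractors or model-generated hallucinations. As a rough sketch only (not the paper's implementation; the function name, raw-embedding inputs, and temperature value are all illustrative assumptions), an InfoNCE-style loss over one positive and k negatives can be written as:

```python
import math

def mixed_contrastive_loss(anchor, positive, negatives, temperature=0.1):
    """Illustrative InfoNCE-style contrastive loss.

    anchor:    embedding (list of floats) of the dialogue context
    positive:  embedding of the grounded, factual response
    negatives: list of embeddings of hard negatives (e.g. retrieved
               distractors or model-generated hallucinations)
    """
    def cos(u, v):
        # cosine similarity between two vectors
        dot = sum(a * b for a, b in zip(u, v))
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(b * b for b in v))
        return dot / (nu * nv)

    # similarity of the context to the positive and to each negative
    sims = [cos(anchor, positive)] + [cos(anchor, n) for n in negatives]
    logits = [s / temperature for s in sims]

    # InfoNCE: negative log-softmax probability of the positive (index 0),
    # computed with the max-subtraction trick for numerical stability
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[0]
```

Minimizing this loss raises the similarity of the context to the factual response relative to the hallucinated candidates; the quality of the hard negatives (the paper examines retrieved and model-generated ones) largely determines how sharp that contrast is.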
Related papers
- Analyzing LLM Behavior in Dialogue Summarization: Unveiling Circumstantial Hallucination Trends [38.86240794422485]
We evaluate the faithfulness of large language models for dialogue summarization.
Our evaluation reveals subtleties as to what constitutes a hallucination.
We introduce two prompt-based approaches for fine-grained error detection that outperform existing metrics.
arXiv Detail & Related papers (2024-06-05T17:49:47Z)
- Mitigating Object Hallucination via Data Augmented Contrastive Tuning [52.43197107069751]
Multimodal Large Language Models (MLLMs) tend to hallucinate factually inaccurate information.
We introduce a contrastive tuning method that can be applied to a pretrained off-the-shelf MLLM for mitigating hallucinations.
arXiv Detail & Related papers (2024-05-28T23:36:00Z)
- Alleviating Hallucinations in Large Vision-Language Models through Hallucination-Induced Optimization [123.54980913741828]
Large Visual Language Models (LVLMs) have demonstrated exceptional abilities in understanding multimodal data.
They invariably suffer from hallucinations, leading to a disconnect between the generated text and the corresponding images.
Almost all current visual contrastive decoding methods attempt to mitigate these hallucinations by introducing visual uncertainty information.
However, they struggle to precisely induce the hallucinatory tokens, which severely limits their effectiveness in mitigating hallucinations.
arXiv Detail & Related papers (2024-05-24T08:46:31Z)
- A Cause-Effect Look at Alleviating Hallucination of Knowledge-grounded Dialogue Generation [51.53917938874146]
We propose a possible solution for alleviating the hallucination in KGD by exploiting the dialogue-knowledge interaction.
Experimental results of our example implementation show that this method can reduce hallucination without disrupting other dialogue performance.
arXiv Detail & Related papers (2024-04-04T14:45:26Z)
- Hallucination Diversity-Aware Active Learning for Text Summarization [46.00645048690819]
Large Language Models (LLMs) have shown propensity to generate hallucinated outputs, i.e., texts that are factually incorrect or unsupported.
Existing methods for alleviating hallucinations typically require costly human annotations to identify and correct hallucinations in LLM outputs.
We propose the first active learning framework to alleviate LLM hallucinations, reducing costly human annotations of hallucination needed.
arXiv Detail & Related papers (2024-04-02T02:30:27Z)
- Retrieve Only When It Needs: Adaptive Retrieval Augmentation for Hallucination Mitigation in Large Language Models [73.93616728895401]
Hallucinations pose a significant challenge for the practical implementation of large language models (LLMs)
We present Rowen, a novel approach that enhances LLMs with a selective retrieval augmentation process tailored to address hallucinations.
arXiv Detail & Related papers (2024-02-16T11:55:40Z)
- Alleviating Hallucinations of Large Language Models through Induced Hallucinations [67.35512483340837]
Large language models (LLMs) have been observed to generate responses that include inaccurate or fabricated information.
We propose a simple Induce-then-Contrast Decoding (ICD) strategy to alleviate hallucinations.
arXiv Detail & Related papers (2023-12-25T12:32:49Z)
- Zero-Resource Hallucination Prevention for Large Language Models [45.4155729393135]
"Hallucination" refers to instances where large language models (LLMs) generate factually inaccurate or ungrounded information.
We introduce a novel pre-language self-evaluation technique, referred to as SELF-FAMILIARITY, which focuses on evaluating the model's familiarity with the concepts present in the input instruction.
We validate SELF-FAMILIARITY across four different large language models, demonstrating consistently superior performance compared to existing techniques.
arXiv Detail & Related papers (2023-09-06T01:57:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.