Dehallucinating Parallel Context Extension for Retrieval-Augmented Generation
- URL: http://arxiv.org/abs/2412.14905v1
- Date: Thu, 19 Dec 2024 14:37:11 GMT
- Title: Dehallucinating Parallel Context Extension for Retrieval-Augmented Generation
- Authors: Zexiong Ma, Shengnan An, Zeqi Lin, Yanzhen Zou, Jian-Guang Lou, Bing Xie
- Abstract summary: Large language models (LLMs) are susceptible to generating hallucinated information, despite the integration of retrieval-augmented generation (RAG). In this paper, we propose DePaC, which alleviates the hallucination problem with context-aware negative training and information-calibrated aggregation.
- Score: 42.76770979205655
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Large language models (LLMs) are susceptible to generating hallucinated information, despite the integration of retrieval-augmented generation (RAG). Parallel context extension (PCE) is a line of research attempting to effectively integrate parallel (unordered) contexts, but it still suffers from hallucinations when adapted to RAG scenarios. In this paper, we propose DePaC (Dehallucinating Parallel Context Extension), which alleviates the hallucination problem with context-aware negative training and information-calibrated aggregation. DePaC is designed to alleviate two types of in-context hallucination: fact fabrication (i.e., LLMs present claims that are not supported by the contexts) and fact omission (i.e., LLMs fail to present claims that can be supported by the contexts). Specifically, (1) for fact fabrication, we apply context-aware negative training, which fine-tunes the LLMs with negative supervision, thus explicitly guiding the LLMs to refuse to answer when contexts are not related to questions; (2) for fact omission, we propose information-calibrated aggregation, which prioritizes context windows with higher information increment from their contexts. Experimental results on nine RAG tasks demonstrate that DePaC significantly alleviates both types of hallucination and consistently achieves better performance on these tasks.
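Below is a minimal, hypothetical sketch of the information-calibrated aggregation idea described in the abstract: each parallel context window proposes a next token, and the window whose context yields the largest log-probability gain over a context-free prompt is selected at each decoding step. The model name, the prompt format, and the use of the argmax-token log-probability gain as a proxy for "information increment" are illustrative assumptions, not the paper's exact implementation; the context-aware negative training component (fine-tuning the model to refuse when contexts are unrelated) is not shown.

```python
# Hypothetical sketch of information-calibrated aggregation over parallel
# (unordered) context windows; DePaC's exact calibration and training are
# not reproduced here.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # placeholder; substitute an instruction-tuned LLM in practice
tok = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME).eval()


@torch.no_grad()
def next_token_logprobs(prompt: str) -> torch.Tensor:
    """Log-probabilities over the vocabulary for the next token."""
    ids = tok(prompt, return_tensors="pt").input_ids
    return torch.log_softmax(model(ids).logits[0, -1], dim=-1)


@torch.no_grad()
def generate_calibrated(contexts: list[str], question: str, max_new_tokens: int = 64) -> str:
    """Greedy decoding where, at each step, every parallel context window
    proposes its most likely next token and the window whose context adds
    the most information over a context-free prompt wins."""
    answer = ""
    for _ in range(max_new_tokens):
        # Distribution without any retrieved context (the calibration baseline).
        base = next_token_logprobs(f"Question: {question}\nAnswer:{answer}")
        best_gain, best_token = float("-inf"), None
        for ctx in contexts:
            lp = next_token_logprobs(
                f"Context: {ctx}\nQuestion: {question}\nAnswer:{answer}"
            )
            token = int(lp.argmax())
            gain = float(lp[token] - base[token])  # proxy for information increment
            if gain > best_gain:
                best_gain, best_token = gain, token
        if best_token == tok.eos_token_id:
            break
        answer += tok.decode([best_token])
    return answer.strip()
```

For instance, calling generate_calibrated with several retrieved passages and a question lets whichever passage most sharpens the next-token distribution drive each step of the answer.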
Related papers
- Span-Level Hallucination Detection for LLM-Generated Answers [0.0]
This paper presents a span-level hallucination detection framework for the SemEval-2025 Shared Task, focusing on English and Arabic texts.
Our approach integrates Semantic Role Labeling (SRL) to decompose the answer into atomic roles, which are then compared with a retrieved reference context.
Using a DeBERTa-based textual entailment model, we evaluate each role's semantic alignment with the retrieved context.
arXiv Detail & Related papers (2025-04-25T18:35:25Z)
- HalluLens: LLM Hallucination Benchmark [49.170128733508335]
Large language models (LLMs) often generate responses that deviate from user input or training data, a phenomenon known as "hallucination".
This paper introduces a comprehensive hallucination benchmark, incorporating both new extrinsic and existing intrinsic evaluation tasks.
arXiv Detail & Related papers (2025-04-24T13:40:27Z)
- Hallucination Detection: Robustly Discerning Reliable Answers in Large Language Models [70.19081534515371]
Large Language Models (LLMs) have gained widespread adoption in various natural language processing tasks.
However, they can generate unfaithful or inconsistent content that deviates from the input source, leading to severe consequences.
We propose a robust discriminator named RelD to effectively detect hallucination in LLMs' generated answers.
arXiv Detail & Related papers (2024-07-04T18:47:42Z)
- RAGged Edges: The Double-Edged Sword of Retrieval-Augmented Chatbots [6.893551641325889]
ChatGPT's tendency to hallucinate -- generate plausible but false information -- poses a significant challenge.
This paper explores how Retrieval-Augmented Generation can counter hallucinations by integrating external knowledge with prompts.
Our results show that RAG increases accuracy in some cases, but can still be misled when prompts directly contradict the model's pre-trained understanding.
arXiv Detail & Related papers (2024-03-02T12:19:04Z)
- Retrieve Only When It Needs: Adaptive Retrieval Augmentation for Hallucination Mitigation in Large Language Models [68.91592125175787]
Hallucinations pose a significant challenge for the practical implementation of large language models (LLMs).
We present Rowen, a novel approach that enhances LLMs with a selective retrieval augmentation process tailored to address hallucinations.
arXiv Detail & Related papers (2024-02-16T11:55:40Z)
- Alleviating Hallucinations of Large Language Models through Induced Hallucinations [67.35512483340837]
Large language models (LLMs) have been observed to generate responses that include inaccurate or fabricated information.
We propose a simple Induce-then-Contrast Decoding (ICD) strategy to alleviate hallucinations (see the sketch after this list).
arXiv Detail & Related papers (2023-12-25T12:32:49Z)
- Contrastive Learning Reduces Hallucination in Conversations [76.55116206021346]
We propose a contrastive learning scheme, named MixCL.
A novel mixed contrastive objective is proposed to explicitly optimize the implicit knowledge elicitation process of LMs.
We show that MixCL achieves comparable performance to state-of-the-art KB-based approaches.
arXiv Detail & Related papers (2022-12-20T16:26:18Z)
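
As referenced in the ICD entry above, the following is a minimal, hypothetical sketch of an induce-then-contrast style decoding step, assuming both a base model and a hallucination-prone counterpart are available. The model identifiers, the weight alpha, and the scoring rule are illustrative assumptions rather than the authors' implementation.

```python
# Hypothetical contrastive-decoding step in the spirit of Induce-then-Contrast
# Decoding (ICD): reward tokens the base model prefers and penalize tokens the
# hallucination-induced model prefers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE_NAME = "gpt2"                       # placeholder base model
INDUCED_NAME = "path/to/induced-model"   # placeholder hallucination-induced model

tok = AutoTokenizer.from_pretrained(BASE_NAME)
base = AutoModelForCausalLM.from_pretrained(BASE_NAME).eval()
induced = AutoModelForCausalLM.from_pretrained(INDUCED_NAME).eval()


@torch.no_grad()
def icd_next_token(prompt: str, alpha: float = 1.0) -> int:
    """One greedy decoding step with a contrastive penalty on the induced model."""
    ids = tok(prompt, return_tensors="pt").input_ids
    lp_base = torch.log_softmax(base(ids).logits[0, -1], dim=-1)
    lp_induced = torch.log_softmax(induced(ids).logits[0, -1], dim=-1)
    scores = (1 + alpha) * lp_base - alpha * lp_induced
    return int(scores.argmax())
```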