DRAGged into Conflicts: Detecting and Addressing Conflicting Sources in Search-Augmented LLMs
- URL: http://arxiv.org/abs/2506.08500v1
- Date: Tue, 10 Jun 2025 06:52:57 GMT
- Title: DRAGged into Conflicts: Detecting and Addressing Conflicting Sources in Search-Augmented LLMs
- Authors: Arie Cattan, Alon Jacovi, Ori Ram, Jonathan Herzig, Roee Aharoni, Sasha Goldshtein, Eran Ofek, Idan Szpektor, Avi Caciularu
- Abstract summary: Retrieval Augmented Generation (RAG) is a commonly used approach for enhancing large language models. We propose a novel taxonomy of knowledge conflict types in RAG, along with the desired model behavior for each type. We then introduce CONFLICTS, a high-quality benchmark with expert annotations of conflict types in a realistic RAG setting.
- Score: 36.47787866482107
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Retrieval Augmented Generation (RAG) is a commonly used approach for enhancing large language models (LLMs) with relevant and up-to-date information. However, the retrieved sources can often contain conflicting information, and it remains unclear how models should address such discrepancies. In this work, we first propose a novel taxonomy of knowledge conflict types in RAG, along with the desired model behavior for each type. We then introduce CONFLICTS, a high-quality benchmark with expert annotations of conflict types in a realistic RAG setting. CONFLICTS is the first benchmark that enables tracking progress on how models address a wide range of knowledge conflicts. We conduct extensive experiments on this benchmark, showing that LLMs often struggle to appropriately resolve conflicts between sources. While prompting LLMs to explicitly reason about potential conflicts in the retrieved documents significantly improves the quality and appropriateness of their responses, substantial room for improvement remains for future research.
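The abstract's key finding, that prompting models to reason explicitly about conflicts helps, can be illustrated with a minimal sketch. The prompt wording, model name, and helper below are hypothetical and assume an OpenAI-style chat API; they are not the paper's actual setup.

```python
# Minimal sketch of conflict-aware RAG prompting (hypothetical; not the
# CONFLICTS paper's exact template). Assumes the openai SDK and an
# OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

def answer_with_conflict_reasoning(question: str, documents: list[str]) -> str:
    # Number the retrieved sources so the model can cite them when
    # explaining which one it trusts and why.
    sources = "\n\n".join(f"[Doc {i + 1}] {doc}" for i, doc in enumerate(documents))
    prompt = (
        f"Sources:\n{sources}\n\n"
        f"Question: {question}\n\n"
        "Before answering, check whether the sources conflict. If they do, "
        "identify the type of conflict (e.g., outdated vs. current "
        "information, or a genuine factual dispute), state which source you "
        "rely on and why, and then answer. If the conflict cannot be "
        "resolved, present both views."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```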
Related papers
- MAGIC: A Multi-Hop and Graph-Based Benchmark for Inter-Context Conflicts in Retrieval-Augmented Generation [4.177310099979434]
Knowledge conflict often arises in RAG systems, where retrieved documents may be inconsistent with one another or contradict the model's parametric knowledge. We propose a knowledge graph (KG)-based framework that generates varied and subtle conflicts between two similar yet distinct contexts. Experimental results on our benchmark, MAGIC, provide intriguing insights into the inner workings of LLMs regarding knowledge conflict.
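A toy version of the KG-to-conflict idea, as one reading of this abstract (the actual MAGIC pipeline is multi-hop and graph-based, and more elaborate than this):

```python
# Toy sketch of KG-based conflict generation: render a knowledge-graph
# triple into two near-identical contexts that disagree only on the object.
# The Triple type, template, and example values are invented for illustration.
from dataclasses import dataclass

@dataclass
class Triple:
    subject: str
    relation: str
    obj: str

def make_conflicting_contexts(triple: Triple, distractor: str) -> tuple[str, str]:
    original = f"{triple.subject} {triple.relation} {triple.obj}."
    # Substitute a plausible but contradictory object to create a subtle,
    # hard-to-spot inter-context conflict.
    corrupted = f"{triple.subject} {triple.relation} {distractor}."
    return original, corrupted

ctx_a, ctx_b = make_conflicting_contexts(
    Triple("Marie Curie", "was born in", "Warsaw"), "Paris"
)
# ctx_a == "Marie Curie was born in Warsaw."
# ctx_b == "Marie Curie was born in Paris."
```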
arXiv Detail & Related papers (2025-07-29T07:19:49Z)
- Towards Agentic RAG with Deep Reasoning: A Survey of RAG-Reasoning Systems in LLMs [69.10441885629787]
Retrieval-Augmented Generation (RAG) lifts the factuality of Large Language Models (LLMs) by injecting external knowledge. However, it falls short on problems that demand multi-step inference; conversely, purely reasoning-oriented approaches often hallucinate or mis-ground facts. This survey synthesizes both strands under a unified reasoning-retrieval perspective.
arXiv Detail & Related papers (2025-07-13T03:29:41Z)
- FaithfulRAG: Fact-Level Conflict Modeling for Context-Faithful Retrieval-Augmented Generation [37.28571879699906]
Large language models (LLMs) augmented with retrieval systems have demonstrated significant potential in handling knowledge-intensive tasks. This paper proposes FaithfulRAG, a novel framework that resolves knowledge conflicts by explicitly modeling discrepancies between the model's parametric knowledge and the retrieved context.
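One way to read "explicitly modeling discrepancies" is to compare a closed-book (parametric) answer against a context-grounded one. The sketch below does that with placeholder names; it is not FaithfulRAG's actual interface.

```python
# Hedged sketch of fact-level conflict detection: elicit the model's
# parametric answer and its context-grounded answer, then flag a conflict
# when they disagree. `llm` is any text-in/text-out callable; all names
# here are placeholders, not FaithfulRAG's API.
from typing import Callable

def detect_parametric_conflict(
    llm: Callable[[str], str], question: str, context: str
) -> dict:
    closed_book = llm(f"Answer from memory only: {question}")
    grounded = llm(f"Context: {context}\nAnswer using only the context: {question}")
    return {
        "parametric_answer": closed_book,
        "contextual_answer": grounded,
        # A real system would use semantic matching, not string equality.
        "conflict": closed_book.strip().lower() != grounded.strip().lower(),
    }
```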
arXiv Detail & Related papers (2025-06-10T16:02:54Z)
- Preference Learning for AI Alignment: a Causal Perspective [55.2480439325792]
We frame this problem in a causal paradigm, providing the rich toolbox of causality to identify persistent challenges. Drawing on the causal inference literature, we identify key assumptions necessary for reliable generalisation. We illustrate failure modes of naive reward models and demonstrate how causally-inspired approaches can improve model robustness.
arXiv Detail & Related papers (2025-06-06T10:45:42Z)
- Resolving Conflicting Evidence in Automated Fact-Checking: A Study on Retrieval-Augmented LLMs [12.923119372847834]
This paper presents the first systematic evaluation of Retrieval-Augmented Generation (RAG) models for fact-checking. Experiments reveal critical vulnerabilities in state-of-the-art RAG methods, particularly in resolving conflicts stemming from differences in media source credibility. Our results show that effectively incorporating source credibility significantly enhances the ability of RAG models to resolve conflicting evidence and improve fact-checking performance.
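As a rough illustration of credibility-aware conflict resolution (not the paper's method), each piece of evidence can vote for its claimed answer with a weight drawn from a per-source credibility prior. The scores and domains below are invented for the example.

```python
# Illustrative credibility-weighted voting over conflicting evidence.
from collections import defaultdict

# Hypothetical credibility priors; a real system would learn or curate these.
CREDIBILITY = {"reuters.com": 0.9, "randomblog.example": 0.2}

def resolve_by_credibility(evidence: list[tuple[str, str]]) -> str:
    """evidence: (claimed_answer, source_domain) pairs."""
    scores: defaultdict[str, float] = defaultdict(float)
    for answer, domain in evidence:
        # Unknown sources get a neutral prior of 0.5.
        scores[answer] += CREDIBILITY.get(domain, 0.5)
    return max(scores, key=scores.get)

winner = resolve_by_credibility([
    ("approved", "reuters.com"),
    ("not approved", "randomblog.example"),
])  # -> "approved"
```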
arXiv Detail & Related papers (2025-05-23T11:35:03Z)
- Retrieval-Augmented Generation with Conflicting Evidence [57.66282463340297]
Large language model (LLM) agents are increasingly employing retrieval-augmented generation (RAG) to improve the factuality of their responses. In practice, these systems often need to handle ambiguous user queries and potentially conflicting information from multiple sources. We propose RAMDocs (Retrieval with Ambiguity and Misinformation in Documents), a new dataset that simulates complex and realistic scenarios of conflicting evidence for a user query.
arXiv Detail & Related papers (2025-04-17T16:46:11Z)
- SegSub: Evaluating Robustness to Knowledge Conflicts and Hallucinations in Vision-Language Models [6.52323086990482]
Vision-language models (VLMs) demonstrate sophisticated multimodal reasoning yet are prone to hallucination when confronted with knowledge conflicts. This research introduces SegSub, a framework for applying targeted image perturbations to investigate VLM resilience against knowledge conflicts.
arXiv Detail & Related papers (2025-02-19T00:26:38Z)
- Insight Over Sight: Exploring the Vision-Knowledge Conflicts in Multimodal LLMs [55.74117540987519]
This paper explores the problem of commonsense-level vision-knowledge conflicts in Multimodal Large Language Models (MLLMs). We introduce an automated framework, augmented with human-in-the-loop quality control, to generate inputs designed to simulate and evaluate these conflicts in MLLMs. Using this framework, we have crafted a diagnostic benchmark consisting of 374 original images and 1,122 high-quality question-answer pairs.
arXiv Detail & Related papers (2024-10-10T17:31:17Z)
- ECon: On the Detection and Resolution of Evidence Conflicts [56.89209046429291]
The rise of large language models (LLMs) has significantly influenced the quality of information in decision-making systems.
This study introduces a method for generating diverse, validated evidence conflicts to simulate real-world misinformation scenarios.
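One plausible reading of "validated evidence conflicts" is filtering candidate pairs with an off-the-shelf NLI model; the sketch below uses roberta-large-mnli for that check, which may differ from the paper's actual pipeline.

```python
# Hedged sketch: validate that two evidence passages actually contradict,
# using an off-the-shelf NLI model (this may differ from the paper's setup).
from transformers import pipeline

nli = pipeline("text-classification", model="roberta-large-mnli")

def is_conflict(premise: str, hypothesis: str, threshold: float = 0.8) -> bool:
    # roberta-large-mnli labels a pair CONTRADICTION / NEUTRAL / ENTAILMENT.
    result = nli([{"text": premise, "text_pair": hypothesis}])[0]
    return result["label"] == "CONTRADICTION" and result["score"] >= threshold
```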
arXiv Detail & Related papers (2024-10-05T07:41:17Z)
- Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models [33.76903352835436]
Large Vision-Language Models (LVLMs) have demonstrated impressive capabilities for capturing and reasoning over multimodal inputs.
These models are prone to parametric knowledge conflicts, which arise from inconsistencies between the knowledge represented in their vision and language components.
We present a systematic approach to detect, interpret, and mitigate them.
arXiv Detail & Related papers (2024-10-04T17:59:28Z)