InteGround: On the Evaluation of Verification and Retrieval Planning in Integrative Grounding
- URL: http://arxiv.org/abs/2509.16534v1
- Date: Sat, 20 Sep 2025 04:48:24 GMT
- Title: InteGround: On the Evaluation of Verification and Retrieval Planning in Integrative Grounding
- Authors: Cheng Jiayang, Qianqian Zhuang, Haoran Li, Chunkit Chan, Xin Liu, Lin Qiu, Yangqiu Song
- Abstract summary: We introduce "integrative grounding" -- the challenge of retrieving and verifying multiple inter-dependent pieces of evidence to support a hypothesis query. Our investigation reveals two critical findings: First, in groundedness verification, while LLMs are robust to redundant evidence, they tend to rationalize using internal knowledge when information is incomplete. We find that undirected planning can degrade performance through noise introduction, while premise abduction emerges as a promising approach due to its logical constraints.
- Score: 51.80327078402434
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Grounding large language models (LLMs) in external knowledge sources is a promising method for faithful prediction. While existing grounding approaches work well for simple queries, many real-world information needs require synthesizing multiple pieces of evidence. We introduce "integrative grounding" -- the challenge of retrieving and verifying multiple inter-dependent pieces of evidence to support a hypothesis query. To systematically study this problem, we repurpose data from four domains for evaluating integrative grounding capabilities. Our investigation reveals two critical findings: First, in groundedness verification, while LLMs are robust to redundant evidence, they tend to rationalize using internal knowledge when information is incomplete. Second, in examining retrieval planning strategies, we find that undirected planning can degrade performance through noise introduction, while premise abduction emerges as a promising approach due to its logical constraints. Additionally, LLMs' zero-shot self-reflection capabilities consistently improve grounding quality. These insights provide valuable direction for developing more effective integrative grounding systems.
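To make the setup concrete, here is a minimal sketch of an integrative grounding loop that combines the two ingredients the abstract highlights: groundedness verification and premise-abduction-driven retrieval planning. This is an illustration under assumptions, not the paper's implementation; `llm` and `retrieve` are caller-supplied stand-ins, the prompt wording is invented, and the reply parsing is deliberately crude.

```python
from typing import Callable, List, Tuple

def ground(hypothesis: str,
           retrieve: Callable[[str], List[str]],
           llm: Callable[[str], str],
           max_rounds: int = 3) -> Tuple[bool, List[str]]:
    """Gather evidence until the hypothesis is grounded or rounds run out."""
    evidence: List[str] = []
    for _ in range(max_rounds):
        # Groundedness verification: is the hypothesis fully entailed by
        # the evidence collected so far, without appeal to internal knowledge?
        verdict = llm(
            "Evidence:\n" + "\n".join(evidence) + "\n"
            f"Hypothesis: {hypothesis}\n"
            "Is the hypothesis fully supported by the evidence alone? "
            "Answer 'yes', or 'no' followed by one missing premise."
        )
        if verdict.strip().lower().startswith("yes"):
            return True, evidence
        # Premise abduction: the stated missing premise becomes the next
        # retrieval query, keeping the search logically constrained.
        # (Crude parse: take everything after the leading 'no'.)
        missing_premise = verdict.strip().split(None, 1)[-1]
        evidence.extend(retrieve(missing_premise))
    return False, evidence
```

Abducting the missing premise, rather than issuing an open-ended query, is what keeps each retrieval round directed; the abstract's finding is that undirected planning instead tends to add noise.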
Related papers
- Leveraging LLM Parametric Knowledge for Fact Checking without Retrieval [60.25608870901428]
Trustworthiness is a core research challenge for agentic AI systems built on Large Language Models (LLMs).
We propose the task of fact-checking without retrieval, focusing on the verification of arbitrary natural language claims, independent of their source robustness.
arXiv Detail & Related papers (2026-03-05T18:42:51Z) - When More Is Less: A Systematic Analysis of Spatial and Commonsense Information for Visual Spatial Reasoning [14.364968874702107]
We conduct a hypothesis-driven analysis of information injection for visual spatial reasoning.
We find that more information does not necessarily yield better reasoning.
These findings highlight the importance of selective, task-aligned information injection.
arXiv Detail & Related papers (2026-02-25T06:22:48Z) - Hallucination Detection and Mitigation in Large Language Models [0.0]
Large Language Models (LLMs) and Large Reasoning Models (LRMs) offer transformative potential for high-stakes domains like finance and law.
Their tendency to hallucinate, generating factually incorrect or unsupported content, poses a critical reliability risk.
This paper introduces a comprehensive framework for hallucination management, built on a continuous improvement cycle driven by root cause awareness.
arXiv Detail & Related papers (2026-01-14T23:19:37Z) - Multi-hop Reasoning via Early Knowledge Alignment [68.28168992785896]
Early Knowledge Alignment (EKA) aims to align Large Language Models with contextually relevant retrieved knowledge.
EKA significantly improves retrieval precision, reduces cascading errors, and enhances both performance and efficiency.
EKA proves effective as a versatile, training-free inference strategy that scales seamlessly to large models.
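The summary suggests a simple training-free pattern: align the model with the retrieved knowledge before any reasoning begins. The sketch below is a hedged reading of that idea, not EKA itself; the function names, prompts, and the single up-front retrieval are all assumptions.

```python
from typing import Callable, List

def early_aligned_answer(question: str,
                         retrieve: Callable[[str], List[str]],
                         llm: Callable[[str], str]) -> str:
    docs = retrieve(question)
    # Alignment step: distill the retrieved text into the facts that
    # actually bear on the question, before any chain-of-thought begins.
    facts = llm("List only the facts relevant to the question.\n"
                f"Question: {question}\nDocuments:\n" + "\n".join(docs))
    # Reasoning step: answer strictly from the aligned facts, which is
    # meant to reduce cascading errors in the later hops.
    return llm(f"Facts:\n{facts}\nAnswer step by step: {question}")
```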
arXiv Detail & Related papers (2025-12-23T08:14:44Z) - VAR: Visual Attention Reasoning via Structured Search and Backtracking [49.427842994857635]
We introduce Visual Attention Reasoning (VAR), a framework that recasts grounded reasoning as a structured search.
VAR decomposes the reasoning process into two key stages: traceable evidence grounding and search-based chain-of-thought.
We show that our 7B model, VAR-7B, sets a new state-of-the-art on a comprehensive suite of hallucination and safety benchmarks.
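The two-stage structure described above maps naturally onto a depth-first search with backtracking over partial reasoning chains. The sketch below shows only that generic shape; `propose_steps`, `verify`, and `is_answer` are assumed stand-ins, and VAR's actual grounding and scoring are more involved.

```python
from typing import Callable, List, Optional

def search_cot(question: str,
               propose_steps: Callable[[str, List[str]], List[str]],
               verify: Callable[[str, List[str]], bool],
               is_answer: Callable[[str], bool],
               depth: int = 5) -> Optional[List[str]]:
    """Search for a verified chain of reasoning steps ending in an answer."""
    def dfs(chain: List[str], d: int) -> Optional[List[str]]:
        if chain and is_answer(chain[-1]):
            return chain
        if d == 0:
            return None
        for step in propose_steps(question, chain):
            # Backtracking: discard candidate steps the verifier
            # cannot ground in evidence, then try the next one.
            if not verify(step, chain):
                continue
            result = dfs(chain + [step], d - 1)
            if result is not None:
                return result
        return None  # all candidates exhausted: backtrack to the caller
    return dfs([], depth)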
arXiv Detail & Related papers (2025-10-21T13:18:44Z) - Towards Agentic RAG with Deep Reasoning: A Survey of RAG-Reasoning Systems in LLMs [69.10441885629787]
Retrieval-Augmented Generation (RAG) lifts the factuality of Large Language Models (LLMs) by injecting external knowledge.
It falls short on problems that demand multi-step inference; conversely, purely reasoning-oriented approaches often hallucinate or mis-ground facts.
This survey synthesizes both strands under a unified reasoning-retrieval perspective.
arXiv Detail & Related papers (2025-07-13T03:29:41Z) - mRAG: Elucidating the Design Space of Multi-modal Retrieval-Augmented Generation [15.991125806837386]
Large Vision-Language Models (LVLMs) have made remarkable strides in multimodal tasks such as visual question answering, visual grounding, and complex reasoning.
Retrieval-Augmented Generation (RAG) offers a practical solution to mitigate these challenges by allowing the LVLMs to access large-scale knowledge databases via retrieval mechanisms.
arXiv Detail & Related papers (2025-05-29T23:32:03Z) - Divide-Then-Align: Honest Alignment based on the Knowledge Boundary of RAG [51.120170062795566]
We propose Divide-Then-Align (DTA) to endow RAG systems with the ability to respond with "I don't know" when the query is out of the knowledge boundary.
DTA balances accuracy with appropriate abstention, enhancing the reliability and trustworthiness of retrieval-augmented systems.
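The abstain-when-out-of-boundary behavior can be illustrated with a small wrapper. The coverage test below is a naive lexical-overlap stand-in for the paper's learned knowledge-boundary division, and `generate` is an assumed answer function.

```python
from typing import Callable, List

def answer_or_abstain(query: str,
                      docs: List[str],
                      generate: Callable[[str, List[str]], str],
                      min_overlap: float = 0.5) -> str:
    """Answer from retrieved docs, or abstain if they don't cover the query."""
    terms = set(query.lower().split())
    support = set(" ".join(docs).lower().split())
    coverage = len(terms & support) / max(len(terms), 1)
    if coverage < min_overlap:
        return "I don't know"  # query falls outside the knowledge boundary
    return generate(query, docs)
```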
arXiv Detail & Related papers (2025-05-27T08:21:21Z) - Multi-Source Knowledge Pruning for Retrieval-Augmented Generation: A Benchmark and Empirical Study [46.55831783809377]
Retrieval-augmented generation (RAG) is increasingly recognized as an effective approach to mitigating the hallucination of large language models (LLMs).
We develop PruningRAG, a plug-and-play RAG framework that uses multi-granularity pruning strategies to more effectively incorporate relevant context and mitigate the negative impact of misleading information.
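Multi-granularity pruning can be sketched as two passes, documents first and then sentences within the survivors. The function below is a toy under that assumption; `score` (e.g., an embedding similarity), the thresholds, and the period-based sentence splitter are all illustrative, not the paper's concrete strategies.

```python
from typing import Callable, List

def prune(query: str,
          docs: List[str],
          score: Callable[[str, str], float],
          doc_threshold: float = 0.3,
          sent_threshold: float = 0.2) -> List[str]:
    """Drop low-relevance documents, then low-relevance sentences."""
    kept_docs = [d for d in docs if score(query, d) >= doc_threshold]
    pruned: List[str] = []
    for doc in kept_docs:
        sentences = [s.strip() for s in doc.split(".") if s.strip()]
        kept = [s for s in sentences if score(query, s) >= sent_threshold]
        if kept:
            pruned.append(". ".join(kept) + ".")
    return pruned
```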
arXiv Detail & Related papers (2024-09-03T03:31:37Z) - Mind the Interference: Retaining Pre-trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models [79.28821338925947]
Domain-Class Incremental Learning is a realistic but challenging continual learning scenario.
To handle these diverse tasks, pre-trained Vision-Language Models (VLMs) are introduced for their strong generalizability.
This incurs a new problem: the knowledge encoded in the pre-trained VLMs may be disturbed when adapting to new tasks, compromising their inherent zero-shot ability.
Existing methods tackle it by tuning VLMs with knowledge distillation on extra datasets, which demands heavy overhead.
We propose the Distribution-aware Interference-free Knowledge Integration (DIKI) framework, retaining the pre-trained knowledge of VLMs.
arXiv Detail & Related papers (2024-07-07T12:19:37Z) - Multi-modal Causal Structure Learning and Root Cause Analysis [67.67578590390907]
We propose Mulan, a unified multi-modal causal structure learning method for root cause localization.
We leverage a log-tailored language model to facilitate log representation learning, converting log sequences into time-series data.
We also introduce a novel key performance indicator-aware attention mechanism for assessing modality reliability and co-learning a final causal graph.
arXiv Detail & Related papers (2024-02-04T05:50:38Z) - How Well Do Large Language Models Truly Ground? [39.39062385290276]
A common method is to generate responses by grounding them in external contexts given as input; models that do so are known as knowledge-augmented models.
Previous research often narrowly defines "grounding" as just having the correct answer, which does not ensure the reliability of the entire response.
We propose a stricter definition of grounding: a model is truly grounded if it (1) fully utilizes the necessary knowledge from the provided context, and (2) stays within the limits of that knowledge.
arXiv Detail & Related papers (2023-11-15T16:11:27Z)
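The two-part definition in the last entry can be operationalized as two entailment checks, shown below as a toy judge. `entails` is an assumed stand-in (e.g., an NLI model), and the decomposition of the response into claims and of the context into necessary facts is taken as given; the paper's actual metric is more nuanced than this sketch.

```python
from typing import Callable, List

def truly_grounded(response_claims: List[str],
                   necessary_facts: List[str],
                   context: str,
                   entails: Callable[[str, str], bool]) -> bool:
    """Toy check of the two-condition grounding definition."""
    response = " ".join(response_claims)
    # Condition (1): the response actually uses every necessary fact
    # from the provided context.
    uses_all = all(entails(response, fact) for fact in necessary_facts)
    # Condition (2): every claim stays within the limits of the context.
    stays_within = all(entails(context, claim) for claim in response_claims)
    return uses_all and stays_within
```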
This list is automatically generated from the titles and abstracts of the papers on this site.