Failure is Feedback: History-Aware Backtracking for Agentic Traversal in Multimodal Graphs
- URL: http://arxiv.org/abs/2602.03432v1
- Date: Tue, 03 Feb 2026 11:54:38 GMT
- Title: Failure is Feedback: History-Aware Backtracking for Agentic Traversal in Multimodal Graphs
- Authors: Joohyung Yun, Doyup Lee, Wook-Shin Han
- Abstract summary: Open-domain multimodal document retrieval aims to retrieve specific components from large and interconnected document corpora. Existing graph-based retrieval approaches rely on a uniform similarity metric that overlooks hop-specific semantics. We propose Failure is Feedback (FiF), which casts subgraph retrieval as a sequential decision process. FiF achieves state-of-the-art retrieval on the benchmarks of MultimodalQA, MMCoQA and WebQA.
- Score: 13.855117422052315
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Open-domain multimodal document retrieval aims to retrieve specific components (paragraphs, tables, or images) from large and interconnected document corpora. Existing graph-based retrieval approaches typically rely on a uniform similarity metric that overlooks hop-specific semantics, and their rigid pre-defined plans hinder dynamic error correction. These limitations suggest that a retriever should adapt its reasoning to the evolving context and recover intelligently from dead ends. To address these needs, we propose Failure is Feedback (FiF), which casts subgraph retrieval as a sequential decision process and introduces two key innovations. (i) We introduce a history-aware backtracking mechanism; unlike standard backtracking that simply reverts the state, our approach piggybacks on the context of failed traversals, leveraging insights from previous failures. (ii) We implement an economically-rational agentic workflow. Unlike conventional agents with static strategies, our orchestrator employs a cost-aware traversal method to dynamically manage the trade-off between retrieval accuracy and inference costs, escalating to intensive LLM-based reasoning only when the prior failure justifies the additional computational investment. Extensive experiments show that FiF achieves state-of-the-art retrieval on the benchmarks of MultimodalQA, MMCoQA and WebQA.
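The abstract names two ideas: backtracking that reuses the context of failed traversals, and escalating to costly reasoning only after a failure. The following is a minimal toy sketch of that general pattern, not the authors' FiF implementation. The graph, node texts, word-overlap scorers, cost values, and every name in it (`traverse`, `TraversalResult`, `is_goal`) are hypothetical and chosen only for illustration; the "expensive" scorer is a plain stand-in for LLM-based reasoning.

```python
from dataclasses import dataclass


@dataclass
class TraversalResult:
    path: list       # nodes from start to goal (empty if no goal was reached)
    cost: float      # accumulated scoring cost
    failures: list   # dead-end nodes, in the order they were abandoned


def terms(text):
    """Lowercased bag of words for a node's text."""
    return set(text.lower().split())


def traverse(graph, texts, start, query, is_goal, cheap_cost=1.0, llm_cost=10.0):
    """Depth-first traversal that backtracks on dead ends and keeps the failure history.

    Assumption: after a failure, the next decision uses a costlier scorer that
    penalises candidates resembling earlier dead ends (a stand-in for LLM reasoning).
    """
    query_terms = terms(query)
    path, visited = [start], {start}
    failures, escalate, cost = [], False, 0.0

    while path:
        node = path[-1]
        if is_goal(node):
            return TraversalResult(path, cost, failures)

        candidates = [n for n in graph.get(node, []) if n not in visited]
        if not candidates:
            # Dead end: record it, step back, and escalate the next decision.
            failures.append(node)
            path.pop()
            escalate = True
            continue

        if escalate:
            # History-aware scoring: words seen in failed nodes (minus query words)
            # count against a candidate.
            failed_terms = set().union(*(terms(texts[f]) for f in failures)) - query_terms

            def score(n):
                t = terms(texts[n])
                return len(t & query_terms) - len(t & failed_terms)

            cost += llm_cost
        else:
            # Cheap scoring: plain word overlap with the query.
            def score(n):
                return len(terms(texts[n]) & query_terms)

            cost += cheap_cost

        best = max(candidates, key=score)  # ties go to the first candidate listed
        visited.add(best)
        path.append(best)
        escalate = False  # de-escalate after taking a step

    return TraversalResult([], cost, failures)


# Hypothetical toy corpus: the cheap scorer is lured into a football branch first.
graph = {
    "start": ["france_football", "football_stadium", "paris_city"],
    "france_football": ["league_table"],
    "paris_city": ["paris_population"],
}
texts = {
    "start": "",
    "france_football": "france capital football",
    "football_stadium": "france football stadium",
    "paris_city": "paris capital city",
    "league_table": "football league table",
    "paris_population": "paris population table",
}

result = traverse(graph, texts, "start", "capital france population",
                  is_goal=lambda n: n == "paris_population")
print(result.path, result.cost, result.failures)
```

In this toy run the cheap scorer first follows the football branch and hits a dead end. After backtracking, the escalated scorer uses the recorded failures to rank `paris_city` above `football_stadium`, even though both tie on plain query overlap. The higher cost is paid only for that one post-failure decision.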
Related papers
- OMG-Agent: Toward Robust Missing Modality Generation with Decoupled Coarse-to-Fine Agentic Workflows [9.617220633655716]
We present Omni-Modality Generation Agent (OMG-Agent).
arXiv Detail & Related papers (2026-02-04T02:25:40Z)
- Search-R2: Enhancing Search-Integrated Reasoning via Actor-Refiner Collaboration [49.9937230730202]
We propose Search-R2, a novel Actor-Refiner collaboration framework that enhances reasoning through targeted intervention. Our approach decomposes the generation process into an Actor, which produces initial reasoning trajectories. We show that Search-R2 consistently outperforms strong RAG and RL-based baselines across model scales.
arXiv Detail & Related papers (2026-02-03T15:32:09Z)
- Beyond Error-Based Optimization: Experience-Driven Symbolic Regression with Goal-Conditioned Reinforcement Learning [14.473539776112666]
We propose a novel framework named EGRL-SR (Experience-driven Goal-conditioned Reinforcement Learning for Regression). We formulate symbolic regression as a goal-conditioned reinforcement learning problem and incorporate hindsight experience replay. We design an all-point satisfaction binary reward function that encourages the action-value network to focus on structural patterns rather than low-error expressions.
arXiv Detail & Related papers (2026-01-21T06:08:37Z)
- SoliReward: Mitigating Susceptibility to Reward Hacking and Annotation Noise in Video Generation Reward Models [53.19726629537694]
Post-training alignment of video generation models with human preferences is a critical goal. Current data collection paradigms, reliant on in-prompt pairwise annotations, suffer from labeling noise. We propose SoliReward, a systematic framework for video RM training.
arXiv Detail & Related papers (2025-12-17T14:28:23Z)
- Adaptive Multi-Agent Reasoning for Text-to-Video Retrieval [12.701443847087164]
We propose an adaptive multi-agent retrieval framework that orchestrates specialized agents over multiple reasoning iterations. Our framework achieves a twofold improvement over CLIP4Clip and significantly outperforms state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2025-12-02T09:52:51Z)
- VAR: Visual Attention Reasoning via Structured Search and Backtracking [49.427842994857635]
We introduce Visual Attention Reasoning, a framework that recasts grounded reasoning as a structured search. VAR decomposes the reasoning process into two key stages: traceable evidence grounding and search-based chain-of-thought. We show that our 7B model, VAR-7B, sets a new state-of-the-art on a comprehensive suite of hallucination and safety benchmarks.
arXiv Detail & Related papers (2025-10-21T13:18:44Z)
- Just-in-time Episodic Feedback Hinter: Leveraging Offline Knowledge to Improve LLM Agents Adaptation [77.90555621662345]
We present JEF Hinter, an agentic system that distills offline traces into compact, context-aware hints. A zooming mechanism highlights decisive steps in long trajectories, capturing both strategies and pitfalls. Experiments on MiniWoB++, WorkArena-L1, and WebArena-Lite show that JEF Hinter consistently outperforms strong baselines.
arXiv Detail & Related papers (2025-10-05T21:34:42Z)
- Rationale-Augmented Retrieval with Constrained LLM Re-Ranking for Task Discovery [4.061135251278187]
Head Start programs utilizing GoEngage face significant challenges when new or rotating staff attempt to locate appropriate Tasks on the platform homepage. These difficulties arise from domain-specific jargon, system-specific nomenclature, and the inherent limitations of lexical search in handling typos and varied word ordering. We propose a pragmatic hybrid semantic search system that combines lightweight typo-tolerant lexical retrieval, embedding-based vector similarity, and constrained large language model (LLM) re-ranking.
arXiv Detail & Related papers (2025-10-01T01:28:59Z)
- Retro*: Optimizing LLMs for Reasoning-Intensive Document Retrieval [44.680580989270965]
Retro* is a novel approach for reasoning-intensive document retrieval. We introduce a rubric-based relevance scoring mechanism, enabling the model to reason about the relationship between a task and a document. Our experiments show that Retro* outperforms existing document retrieval methods with notable advantages.
arXiv Detail & Related papers (2025-09-29T14:53:05Z)
- DAMR: Efficient and Adaptive Context-Aware Knowledge Graph Question Answering with LLM-Guided MCTS [28.828541350757714]
This paper proposes Dynamically Adaptive MCTS-based Reasoning (DAMR) for Knowledge Graph Question Answering (KGQA). DAMR integrates Monte Carlo Tree Search (MCTS) with adaptive path evaluation to enable context-aware KGQA. Experiments on multiple KGQA benchmarks show DAMR significantly outperforms SOTA methods.
arXiv Detail & Related papers (2025-08-01T15:38:21Z)
- Constrained Auto-Regressive Decoding Constrains Generative Retrieval [71.71161220261655]
Generative retrieval seeks to replace traditional search index data structures with a single large-scale neural network. In this paper, we examine the inherent limitations of constrained auto-regressive generation from two essential perspectives: constraints and beam search.
arXiv Detail & Related papers (2025-04-14T06:54:49Z)
- iEBAKER: Improved Remote Sensing Image-Text Retrieval Framework via Eliminate Before Align and Keyword Explicit Reasoning [80.44805667907612]
iEBAKER is an innovative strategy to filter weakly correlated sample pairs. We introduce an alternative Sort After Reversed Retrieval (SAR) strategy. We incorporate a Keyword Explicit Reasoning (KER) module to facilitate the beneficial impact of subtle key concept distinctions.
arXiv Detail & Related papers (2025-04-08T03:40:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.