Related papers: Failure is Feedback: History-Aware Backtracking for Agentic Traversal in Multimodal Graphs

URL: http://arxiv.org/abs/2602.03432v1
Date: Tue, 03 Feb 2026 11:54:38 GMT
Title: Failure is Feedback: History-Aware Backtracking for Agentic Traversal in Multimodal Graphs
Authors: Joohyung Yun, Doyup Lee, Wook-Shin Han
Abstract summary: Open-domain multimodal document retrieval aims to retrieve specific components from large and interconnected document corpora. Existing graph-based retrieval approaches rely on a uniform similarity metric that overlooks hop-specific semantics. We propose Failure is Feedback (FiF), which casts subgraph retrieval as a sequential decision process. FiF achieves state-of-the-art retrieval on the benchmarks of MultimodalQA, MMCoQA and WebQA.
Score: 13.855117422052315
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Open-domain multimodal document retrieval aims to retrieve specific components (paragraphs, tables, or images) from large and interconnected document corpora. Existing graph-based retrieval approaches typically rely on a uniform similarity metric that overlooks hop-specific semantics, and their rigid pre-defined plans hinder dynamic error correction. These limitations suggest that a retriever should adapt its reasoning to the evolving context and recover intelligently from dead ends. To address these needs, we propose Failure is Feedback (FiF), which casts subgraph retrieval as a sequential decision process and introduces two key innovations. (i) We introduce a history-aware backtracking mechanism; unlike standard backtracking that simply reverts the state, our approach piggybacks on the context of failed traversals, leveraging insights from previous failures. (ii) We implement an economically-rational agentic workflow. Unlike conventional agents with static strategies, our orchestrator employs a cost-aware traversal method to dynamically manage the trade-off between retrieval accuracy and inference costs, escalating to intensive LLM-based reasoning only when the prior failure justifies the additional computational investment. Extensive experiments show that FiF achieves state-of-the-art retrieval on the benchmarks of MultimodalQA, MMCoQA and WebQA.
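
Illustrative sketch: the abstract describes two ideas, reusing the context of failed traversals when backtracking and escalating to costly reasoning only after a failure. The toy Python below is one possible reading of those ideas and is not the authors' implementation. Everything in it is an assumption made for illustration: the word-overlap scorer, the Jaccard penalty standing in for LLM-based reasoning, the cost constants, the oracle target check, and the tiny graph.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Sequence, Set

# Hypothetical cost units; the paper's actual cost model is not given here.
CHEAP_COST, EXPENSIVE_COST = 1, 10


def words(s: str) -> Set[str]:
    return set(s.lower().split())


def overlap(node_text: str, query: Set[str]) -> float:
    """Cheap scorer: fraction of query terms found in the node text."""
    return len(words(node_text) & query) / max(len(query), 1)


def jaccard(a: Set[str], b: Set[str]) -> float:
    return len(a & b) / max(len(a | b), 1)


@dataclass
class Traversal:
    graph: Dict[str, List[str]]
    text: Dict[str, str]
    query: Set[str]
    target: str  # oracle answer node, assumed known only for this toy
    failures: List[str] = field(default_factory=list)  # dead-end history
    visited: Set[str] = field(default_factory=set)
    cost: int = 0

    def pick(self, candidates: List[str]) -> str:
        # No failure yet: stay on the cheap scorer.
        if not self.failures:
            self.cost += CHEAP_COST
            return max(candidates, key=lambda n: overlap(self.text[n], self.query))
        # A failure occurred: pay more and use the failure history.
        # The penalty is a stand-in for LLM reasoning over failed paths.
        self.cost += EXPENSIVE_COST
        failed = [words(self.text[f]) for f in self.failures]

        def score(n: str) -> float:
            penalty = max(jaccard(words(self.text[n]), f) for f in failed)
            return overlap(self.text[n], self.query) - penalty

        return max(candidates, key=score)

    def visit(self, node: str, path: Sequence[str] = ()) -> Optional[List[str]]:
        self.visited.add(node)
        path = [*path, node]
        if node == self.target:
            return path
        while True:
            candidates = [n for n in self.graph.get(node, []) if n not in self.visited]
            if not candidates:
                break
            found = self.visit(self.pick(candidates), path)
            if found:
                return found
        self.failures.append(node)  # record the dead end before backtracking
        return None


graph = {"q": ["a", "c", "b"], "a": ["a1"], "b": ["t"]}
text = {
    "q": "question",
    "a": "eiffel tower paris",
    "a1": "paris photo gallery",
    "c": "eiffel tower paris souvenir",
    "b": "height statistics",
    "t": "eiffel tower height 330 m",
}
trav = Traversal(graph, text, query={"eiffel", "tower", "height"}, target="t")
path = trav.visit("q")
print(path, trav.failures, trav.cost)
```

In this toy run the cheap scorer first follows "a" into a dead end. After backtracking, the history-aware scorer penalizes "c" because its text resembles the failed node "a", so the traversal moves to "b" and reaches the target without visiting "c".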

Related papers

OMG-Agent: Toward Robust Missing Modality Generation with Decoupled Coarse-to-Fine Agentic Workflows [9.617220633655716]
We present Omni-Modality Generation Agent (OMG-Agent).
arXiv Detail & Related papers (2026-02-04T02:25:40Z)
Search-R2: Enhancing Search-Integrated Reasoning via Actor-Refiner Collaboration [49.9937230730202]
We propose Search-R2, a novel Actor-Refiner collaboration framework that enhances reasoning through targeted intervention. Our approach decomposes the generation process into an Actor, which produces initial reasoning trajectories. We show that Search-R2 consistently outperforms strong RAG and RL-based baselines across model scales.
arXiv Detail & Related papers (2026-02-03T15:32:09Z)
Beyond Error-Based Optimization: Experience-Driven Symbolic Regression with Goal-Conditioned Reinforcement Learning [14.473539776112666]
We propose a novel framework named EGRL-SR (Experience-driven Goal-conditioned Reinforcement Learning for Regression). We formulate symbolic regression as a goal-conditioned reinforcement learning problem and incorporate hindsight experience replay. We design an all-point satisfaction binary reward function that encourages the action-value network to focus on structural patterns rather than low-error expressions.
arXiv Detail & Related papers (2026-01-21T06:08:37Z)
SoliReward: Mitigating Susceptibility to Reward Hacking and Annotation Noise in Video Generation Reward Models [53.19726629537694]
Post-training alignment of video generation models with human preferences is a critical goal. Current data collection paradigms, reliant on in-prompt pairwise annotations, suffer from labeling noise. We propose SoliReward, a systematic framework for video RM training.
arXiv Detail & Related papers (2025-12-17T14:28:23Z)
Adaptive Multi-Agent Reasoning for Text-to-Video Retrieval [12.701443847087164]
We propose an adaptive multi-agent retrieval framework that orchestrates specialized agents over multiple reasoning iterations. Our framework achieves a twofold improvement over CLIP4Clip and outperforms state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2025-12-02T09:52:51Z)
VAR: Visual Attention Reasoning via Structured Search and Backtracking [49.427842994857635]
We introduce Visual Attention Reasoning, a framework that recasts grounded reasoning as a structured search. VAR decomposes the reasoning process into two key stages: traceable evidence grounding and search-based chain-of-thought. We show that our 7B model, VAR-7B, sets a new state-of-the-art on a comprehensive suite of hallucination and safety benchmarks.
arXiv Detail & Related papers (2025-10-21T13:18:44Z)
Just-in-time Episodic Feedback Hinter: Leveraging Offline Knowledge to Improve LLM Agents Adaptation [77.90555621662345]
We present JEF Hinter, an agentic system that distills offline traces into compact, context-aware hints. A zooming mechanism highlights decisive steps in long trajectories, capturing both strategies and pitfalls. Experiments on MiniWoB++, WorkArena-L1, and WebArena-Lite show that JEF Hinter consistently outperforms strong baselines.
arXiv Detail & Related papers (2025-10-05T21:34:42Z)
Rationale-Augmented Retrieval with Constrained LLM Re-Ranking for Task Discovery [4.061135251278187]
Head Start programs utilizing GoEngage face significant challenges when new or rotating staff attempt to locate appropriate Tasks on the platform homepage. These difficulties arise from domain-specific jargon, system-specific nomenclature, and the inherent limitations of lexical search in handling typos and varied word ordering. We propose a pragmatic hybrid semantic search system that combines lightweight typo-tolerant lexical retrieval, embedding-based vector similarity, and constrained large language model (LLM) re-ranking.
arXiv Detail & Related papers (2025-10-01T01:28:59Z)
Retro*: Optimizing LLMs for Reasoning-Intensive Document Retrieval [44.680580989270965]
Retro* is a novel approach for reasoning-intensive document retrieval. We introduce a rubric-based relevance scoring mechanism, enabling the model to reason about the relationship between a task and a document. Our experiments show that Retro* outperforms existing document retrieval methods with notable advantages.
arXiv Detail & Related papers (2025-09-29T14:53:05Z)
DAMR: Efficient and Adaptive Context-Aware Knowledge Graph Question Answering with LLM-Guided MCTS [28.828541350757714]
This paper proposes Dynamically Adaptive MCTS-based Reasoning (DAMR) for Knowledge Graph Question Answering (KGQA). DAMR integrates Monte Carlo Tree Search (MCTS) with adaptive path evaluation to enable context-aware KGQA. Experiments on multiple KGQA benchmarks show DAMR significantly outperforms SOTA methods.
arXiv Detail & Related papers (2025-08-01T15:38:21Z)
Constrained Auto-Regressive Decoding Constrains Generative Retrieval [71.71161220261655]
Generative retrieval seeks to replace traditional search index data structures with a single large-scale neural network. In this paper, we examine the inherent limitations of constrained auto-regressive generation from two essential perspectives: constraints and beam search.
arXiv Detail & Related papers (2025-04-14T06:54:49Z)
iEBAKER: Improved Remote Sensing Image-Text Retrieval Framework via Eliminate Before Align and Keyword Explicit Reasoning [80.44805667907612]
iEBAKER is an innovative strategy to filter weakly correlated sample pairs. We introduce an alternative Sort After Reversed Retrieval (SAR) strategy. We incorporate a Keyword Explicit Reasoning (KER) module to facilitate the beneficial impact of subtle key concept distinctions.
arXiv Detail & Related papers (2025-04-08T03:40:19Z)

This list is automatically generated from the titles and abstracts of the papers on this site.